Contribute

Machine learning algorithms, such as the one ApriBot uses, need to be trained on labelled data: that is, posts which have been manually classified (by experts—yes, that's you!) as being either Aprimon-related or not.


ApriBot's current algorithm is based on XGBoost and performs quite respectably: on unseen posts, it achieves an accuracy of approximately 96% and an F1 score of 93%. In general, this accuracy doesn't seem to improve with more data, so there's no urgent need to label more posts. However, if you want to, you can still do so. I will be very grateful, and it is possible that a different algorithm will be able to make use of the extra data!
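If you are wondering why accuracy and F1 score give different numbers, here is a minimal sketch of how the two are computed from a classifier's confusion matrix. The function names and the counts below are hypothetical, chosen purely for illustration; they are not ApriBot's actual code or evaluation data.

```python
def accuracy(tp: int, fp: int, fn: int, tn: int) -> float:
    """Fraction of all posts that were classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score(tp: int, fp: int, fn: int, tn: int) -> float:
    """Harmonic mean of precision and recall for the positive class.

    True negatives (tn) are accepted for a uniform signature but do not
    enter the formula: F1 ignores correctly rejected posts.
    """
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts for 150 unseen posts (NOT real ApriBot results):
#   tp: Aprimon-related posts correctly flagged
#   fp: unrelated posts wrongly flagged
#   fn: Aprimon-related posts that were missed
#   tn: unrelated posts correctly ignored
counts = dict(tp=40, fp=3, fn=3, tn=104)

print(f"accuracy: {accuracy(**counts):.3f}")
print(f"F1 score: {f1_score(**counts):.3f}")
```

With these made-up counts, most posts are unrelated, so the many easy true negatives lift the accuracy, while the F1 score only reflects how well the Aprimon-related posts themselves are handled. That is one common reason for F1 to come out lower than accuracy.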

Please log in with Reddit (in the top-right corner) to continue.