Researchers who study abuse and cybercrime in underground forums often rely on supervised machine learning to classify posts, but the human annotation needed for training data is expensive. This study introduces a methodology for generating stratified samples based on the forum's centrality properties. It finds that samples with a uniform distribution of post degree centrality improve recall by 30% while maintaining precision. It also shows that classifiers trained on similar samples can have up to 33% disagreement in classifying criminal activities when applied to the entire forum.
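
The sampling idea can be illustrated with a minimal sketch. It is not the study's actual procedure: the equal-width binning, the per-bin quota, and the names `uniform_centrality_sample` and `disagreement_rate` are assumptions made for illustration. It assumes each post's degree (for example, its number of reply or quote links) is already known. It draws the same number of posts from each degree bin, so that rare high-degree posts are not swamped by the many low-degree ones, and it measures how often two classifiers assign different labels to the same posts.

```python
import random
from collections import defaultdict


def uniform_centrality_sample(degree, n_bins, per_bin, seed=0):
    """Sample post ids so that each degree bin contributes equally.

    degree: dict mapping post id -> degree of the post (assumed precomputed).
    n_bins: number of equal-width degree bins (an illustrative choice).
    per_bin: maximum number of posts drawn from each non-empty bin.
    """
    rng = random.Random(seed)
    lo, hi = min(degree.values()), max(degree.values())
    # Fall back to width 1 when every post has the same degree.
    width = (hi - lo) / n_bins or 1
    bins = defaultdict(list)
    for post, d in degree.items():
        # Clamp so the maximum degree falls in the last bin.
        idx = min(int((d - lo) / width), n_bins - 1)
        bins[idx].append(post)
    sample = []
    for idx in sorted(bins):
        members = sorted(bins[idx])  # sorted for reproducibility
        sample.extend(rng.sample(members, min(per_bin, len(members))))
    return sample


def disagreement_rate(labels_a, labels_b):
    """Fraction of posts on which two classifiers assign different labels."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have the same length")
    differing = sum(a != b for a, b in zip(labels_a, labels_b))
    return differing / len(labels_a)
```

A simple random sample of the same size would consist mostly of low-degree posts, because those dominate most forums. The per-bin quota is what flattens the degree distribution of the annotated set.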