This paper addresses the challenge of extracting structured data, such as post titles, authors, timestamps, and content, from unstructured web forum pages. It introduces a method based on Markov Logic Networks (MLNs) that integrates page-level and site-level knowledge to improve extraction accuracy. Experiments on 20 forums show that incorporating site-level knowledge significantly improves both precision and recall compared with using page-level knowledge alone.
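
To make the idea of combining page-level and site-level knowledge concrete, the following is a minimal toy sketch of MLN-style scoring, not the paper's actual model. An MLN assigns each candidate labeling a probability proportional to the exponential of the summed weights of the formulas it satisfies. The two rules, their weights, the label set, and the two example elements are all hypothetical and chosen only for illustration.

```python
import math
from itertools import product

# Hypothetical weights for illustration; not taken from the paper.
W_DATE_TEXT = 1.5  # page-level rule: date-like text suggests a timestamp
W_SAME_PATH = 2.0  # site-level rule: same DOM path on different pages, same label

LABELS = ("timestamp", "author")

# Toy elements (assumed data): (page id, DOM path, text looks like a date?)
ELEMENTS = [
    ("page1", "/div/span[2]", True),
    ("page2", "/div/span[2]", False),
]


def score(labels, use_site_level=True):
    """Sum the weights of the rule instances that a labeling satisfies.

    Rule instances whose precondition is false are satisfied by every
    labeling, so they are skipped; they cancel out after normalization.
    """
    total = 0.0
    # Page-level rule, evaluated on each element independently.
    for (_, _, is_date), label in zip(ELEMENTS, labels):
        if is_date and label == "timestamp":
            total += W_DATE_TEXT
    # Site-level rule, evaluated on pairs of elements from different pages.
    if use_site_level:
        for i in range(len(ELEMENTS)):
            for j in range(i + 1, len(ELEMENTS)):
                same_path = ELEMENTS[i][1] == ELEMENTS[j][1]
                diff_page = ELEMENTS[i][0] != ELEMENTS[j][0]
                if same_path and diff_page and labels[i] == labels[j]:
                    total += W_SAME_PATH
    return total


def distribution(use_site_level=True):
    """Return P(labeling) = exp(score) / Z over all candidate labelings."""
    worlds = list(product(LABELS, repeat=len(ELEMENTS)))
    weights = [math.exp(score(w, use_site_level)) for w in worlds]
    z = sum(weights)
    return {w: wt / z for w, wt in zip(worlds, weights)}


def marginal(index, label, use_site_level=True):
    """Probability that the element at `index` receives `label`."""
    dist = distribution(use_site_level)
    return sum(p for w, p in dist.items() if w[index] == label)
```

In this toy setting, the second element has no date-like text, so page-level evidence alone leaves its label undecided. Calling `marginal(1, "timestamp", use_site_level=True)` and comparing it with `use_site_level=False` shows how a shared DOM path across pages can pull its label toward the one supported on the first page.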