Parsing HTML in PHP using Simple HTML DOM

Being able to parse HTML with PHP is essential if you need to scrape data from a website or add or remove parts of an HTML document.

Fortunately, this is extremely easy with Simple HTML DOM. This 46KB include reads an HTML file into an object which you can then step through as you please, with functions that find tags by type, class and/or id. Better still, these functions work much like jQuery selectors, which makes the library really easy and pleasant to use.

Below is a very simple example of how to use Simple HTML DOM:

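<?php
// Load the library (assumes simple_html_dom.php is in the same directory)
include 'simple_html_dom.php';

$url = 'http://url';

// file_get_html() returns false if the page could not be loaded
$html = file_get_html($url);
if (!$html) {
	die('Could not load ' . $url);
}

// Print every <div> where class = "pricing" (outertext includes the <div> tag itself)
foreach($html->find('div[class=pricing]') as $data) {
	echo $data->outertext;
}
?>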

And here is a more advanced example showing one find() nested inside another: the outer loop selects table rows, and the inner loop selects cells within each row:

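<?php
// Load the library (assumes simple_html_dom.php is in the same directory)
include 'simple_html_dom.php';

$url = 'http://url';

$html = file_get_html($url);

// Collected cells; without an array, each iteration would overwrite the last value
$data = array();

//Get every <tr> of <table class="results">
foreach($html->find('table[class=results] tr') as $tr) {
	//get all <td> where class is NOT "debug"
	foreach($tr->find('td[class!=debug]') as $t) {
		//outertext is the cell's HTML including the <td> tag;
		//use $t->innertext for the contents only
		$data[] = $t->outertext;
	}
}
?>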
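
To try the selectors without fetching a live page, the library's str_get_html() function parses a string instead of a URL. The sketch below is only an illustration: the markup, the variable names and the location of simple_html_dom.php (assumed to sit next to the script) are made up for the example. It looks up elements by class, by id and by tag, then removes one element, which is one way to handle the "remove parts of an HTML document" case mentioned at the start.

```php
<?php
// Assumption: simple_html_dom.php is in the same directory as this script
require_once 'simple_html_dom.php';

// Hypothetical markup used only for this example
$markup = '<div id="main">'
        . '<p class="pricing">$10</p>'
        . '<p class="pricing">$20</p>'
        . '<a href="/buy">Buy</a>'
        . '</div>';

$html = str_get_html($markup);

// By class: find() without an index returns an array of matching elements
$prices = array();
foreach ($html->find('p.pricing') as $p) {
	$prices[] = $p->plaintext; // text content with the tags stripped
}

// By id: passing an index returns a single element (or null if nothing matches)
$main = $html->find('#main', 0);

// By tag: attributes are read as properties of the element
$link = $html->find('a', 0);
$href = $link->href;

// Remove the link by blanking its outertext, then serialise the document
$link->outertext = '';
$result = $html->save();
?>
```

After this runs, $prices holds the text of both pricing paragraphs, $href holds the link target, and $result is the markup with the link taken out. Passing an index to find() is handy when you expect exactly one match, but check the result for null before using it on real pages.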

You can download Simple HTML DOM from SourceForge.

Author: Dean Williams

I'm a Web Developer, Graphics Designer and Gamer. This is my personal site, which provides PHP programming advice, hints and tips.
