The new things
What can this script do? What is it?
This is a PHP script that is used to parse the DMOZ RDF data dump files. For more information about these files, visit http://rdf.dmoz.org/.
Current features of this script include:
Step 1: Open config.php and edit it.
Step 2: Run create_tables.php to create the tables in your database.
Step 3: Run start_script.php from the command prompt (i.e. php start_script.php).
Abracadabra, the script handles the dirty work for you :) Just lay back and relax - and smoke a joint (nah bad idea hehe).
Additional: Run drop_tables.php to delete the tables in your database.
Cheers,
Amir Salihefendic, amix@amix.dk
PS: If you run into trouble or find some bugs, post them on https://sourceforge.net/projects/dmoz2mysql/
THANKS!
17. dec. 2003
Added set_time_limit(0); which removes the maximum execution time limit.
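To illustrate what that call does, here is a minimal sketch of a long-running script that lifts the time limit before it starts working. It is not taken from the script itself; the loop is only a placeholder for the real parsing work. Note that PHP run from the command line already defaults to no time limit, so the call mostly matters when a script is started through a web server.

```php
<?php
// Remove PHP's execution time limit so a long import is not aborted.
// Passing 0 means "no limit".
set_time_limit(0);

// Placeholder for the long-running work (hypothetical, not from the script).
$processed = 0;
for ($i = 0; $i < 1000; $i++) {
    $processed++;
}

echo "Processed $processed items\n";
```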
2. feb. 2004
Fixed a MAJOR bug (a catid bug that set almost every catid to 0!). A big thanks goes to Murray Woodman and Tony Spencer for reporting this bug.
Moved the queries for creating tables out of start_script.php (too many people had problems with them)! To create your tables you need to run create_tables.php.
12. feb. 2004
This is a major update :) It fixes some very, very nasty bugs. It adds some new features - plus some little tweaks here and there!
Fixed some MAJOR bugs (bugs that made catids turn to 0!). I have also updated the code that extracts the DMOZ data dump files, since some users had problems with it. I have tested it on Windows and Mac OS X and it worked fine.
A new feature is that you control the script from config.php.
Well, have fun :) I hope the script works fine! PS: I have replaced all "" with '' (where I could) - I really hope it doesn't cause any problems. If you find an error, email amix@amix.dk.
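The reason a quote change can cause problems is that PHP treats the two quote styles differently. The short sketch below shows the difference; the variable names are made up for the example and do not come from the script.

```php
<?php
$name = 'dmoz';

// Double quotes: variables and escape sequences such as \n are interpreted.
$double = "table_$name\n";

// Single quotes: the text is taken literally; only \' and \\ are escapes.
$single = 'table_$name\n';

echo strlen($double), "\n"; // "table_dmoz" plus one newline character
echo strlen($single), "\n"; // "table_$name" plus a backslash and an "n"
```

So a string that relied on a variable or on \n inside double quotes would change meaning if its quotes were swapped.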
13. feb. 2004
Again a major update :) I haven't released version 2.0 since it had some problems. It didn't work because it needed LOTS of memory! Now I have fixed that error by making a new class that splits the RDF file into small files (25 MB). This makes it easy to load them into memory - and now the script works WITHOUT any bugs ;-) [I have tested it for hours now ehehe].
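The idea of splitting a big file into small parts can be sketched roughly as below. This is not the script's actual class: split_file() and its parameters are hypothetical, it is written for current PHP versions, and it simply breaks at line ends. A real splitter for RDF data would presumably need to break at element boundaries so that each part stays parseable.

```php
<?php
/**
 * Split a large text file into parts of roughly $maxBytes each,
 * breaking only at line ends. Returns the list of part file names.
 * Hypothetical helper for illustration, not the script's own code.
 */
function split_file(string $source, string $prefix, int $maxBytes): array
{
    $in = fopen($source, 'rb');
    if ($in === false) {
        throw new RuntimeException("Cannot open $source");
    }

    $parts = [];
    $out = null;
    $written = 0;

    // Read line by line so only one line is held in memory at a time.
    while (($line = fgets($in)) !== false) {
        // Start a new part when none is open or the current one is full.
        if ($out === null || $written >= $maxBytes) {
            if ($out !== null) {
                fclose($out);
            }
            $partName = sprintf('%s.%03d', $prefix, count($parts) + 1);
            $out = fopen($partName, 'wb');
            if ($out === false) {
                fclose($in);
                throw new RuntimeException("Cannot create $partName");
            }
            $parts[] = $partName;
            $written = 0;
        }
        fwrite($out, $line);
        $written += strlen($line);
    }

    if ($out !== null) {
        fclose($out);
    }
    fclose($in);

    return $parts;
}
```

It could be called as split_file('dump.rdf', 'dump_part', 25 * 1024 * 1024), where the file names are only examples. Each part can then be loaded and parsed on its own, which keeps memory use bounded by the part size instead of the size of the whole dump.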
18. feb. 2004
Fixed a little bug in class_parse.php that kept the content file from being split.