Simple node.js script to import Wikipedia XML dump into MongoDB database.
This is a simple Node.js script that imports a Wikipedia XML dump into a MongoDB database.
0.10.18
+0.2.0
+1.3.19
+0.2.3
+ (can be removed to use only stdout)2.2
+Wikipedia XML dump file (uncompressed)
http://dumps.wikimedia.org
{
  title: string,
  ns: string,
  id: number,
  revision: {
    id: number,
    parentid: number,
    timestamp: date,
    contributor: {
      username: string,
      id: number,
      ip: string
    },
    comment: string,
    text: string,
    sha1: string,
    model: string,
    format: string
  }
}
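To make the document shape above more concrete, here is a minimal sketch of how the raw string values of one page could be turned into such a document. The helper name `toPageDocument` and the input object are hypothetical and not taken from app.js; the sketch also assumes that a contributor carries either `username` and `id` or only `ip`, and that `parentid` may be missing.

```javascript
// Hypothetical helper (not part of app.js): converts the raw string values
// of one page into a document shaped like the structure above.
function toPageDocument(raw) {
  const rev = raw.revision || {};
  const rawContributor = rev.contributor || {};

  // Assumption: a contributor has either username + id or only ip.
  const contributor = {};
  if (rawContributor.username !== undefined) {
    contributor.username = rawContributor.username;
    contributor.id = Number(rawContributor.id);
  }
  if (rawContributor.ip !== undefined) {
    contributor.ip = rawContributor.ip;
  }

  const revision = {
    id: Number(rev.id),
    timestamp: new Date(rev.timestamp), // ISO 8601 string -> Date
    contributor: contributor,
    comment: rev.comment,
    text: rev.text,
    sha1: rev.sha1,
    model: rev.model,
    format: rev.format
  };
  // Assumption: parentid is only set when the raw value is present.
  if (rev.parentid !== undefined) {
    revision.parentid = Number(rev.parentid);
  }

  return {
    title: raw.title,
    ns: raw.ns,
    id: Number(raw.id),
    revision: revision
  };
}

// Example input with made-up values.
const doc = toPageDocument({
  title: 'Example',
  ns: '0',
  id: '12',
  revision: {
    id: '345',
    parentid: '344',
    timestamp: '2013-09-01T12:00:00Z',
    contributor: { username: 'SomeUser', id: '7' },
    comment: 'fix typo',
    text: 'Page text',
    sha1: 'abc',
    model: 'wikitext',
    format: 'text/x-wiki'
  }
});
console.log(doc.title, doc.id, doc.revision.timestamp.toISOString());
```

Numeric fields are stored as numbers and the timestamp as a Date, so they can be compared and sorted in queries as numbers and dates instead of strings.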
node app.js db dump drop
Arguments:
db: MongoDB database
dump: Wikipedia dump XML file (uncompressed)
drop: Drop the pages collection (if it exists) before inserting new documents
Example:
node app.js 'mongodb://localhost:27017/wiki' '/media/Data/enwiki.xml' drop
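The three positional arguments could be read roughly as in the sketch below. This is an illustration only: `parseArgs` is a hypothetical name, app.js may handle its arguments differently, and the sketch assumes that the `drop` argument can be left out when the collection should be kept.

```javascript
// Hypothetical sketch (not taken from app.js) of reading the arguments
// db, dump and drop from the command line.
function parseArgs(argv) {
  // argv[0] is the node binary and argv[1] the script path,
  // so the user-supplied arguments start at index 2.
  const args = argv.slice(2);
  const db = args[0];
  const dump = args[1];
  if (!db || !dump) {
    throw new Error('Usage: node app.js db dump drop');
  }
  // Assumption: the collection is dropped only when the literal
  // word "drop" is passed as the third argument.
  return { db: db, dump: dump, drop: args[2] === 'drop' };
}

// Same values as in the example above.
const options = parseArgs([
  'node',
  'app.js',
  'mongodb://localhost:27017/wiki',
  '/media/Data/enwiki.xml',
  'drop'
]);
console.log(options);
```

In a real script the function would be called with `process.argv` instead of a hand-written array.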
This project is BSD (2 clause) licensed.