Liang Gong
unread,
Apr 7, 2015, 11:41:23 AM4/7/15
to
I need search for the ip address type of string (like 192.168.0.1) a lot in my text fields. It seems that Mongodb is extremely slow in such operations. If i remove the dots from the original text (192168000001) and search for the new string, I can tell the performance improvement. My theory is the dots causes the ambiguity to the search engine which leads to such slowness.
I wonder if there are any solutions other than removing the dot?
Tim Hawkins
unread,
Apr 7, 2015, 11:54:49 AM4/7/15
to
You could convert the ipaddresses into 32bit unsigned integers, which is what they represent. Once indexed you can then perform fast queries.
Once in integer format then performing network searchs for things like "192.168.*" would involve calculating the lowest and highest ipaddress in the range and using it with $gte and $lte queries.
There are a number of libraries in various languages for performing the parsing and conversion to and from integers.
For example the php pear library //pear.php.net/manual/en/package.networking.net-ipv4.php
I need search for the ip address type of string (like 192.168.0.1) a lot in my text fields. It seems that Mongodb is extremely slow in such operations. If i remove the dots from the original text (192168000001) and search for the new string, I can tell the performance improvement. My theory is the dots causes the ambiguity to the search engine which leads to such slowness.
I wonder if there are any solutions other than removing the dot?
Liang Gong
unread,
Apr 7, 2015, 12:17:20 PM4/7/15
to
That was my original solution which was denied by the boss. His decision was based on the facts that the data in MongoDb are collected from different data source and it is a management nightmare to do such code changes in all the places. My bottom line solution will be use a "trigger" type solution in MongoDb to transfer the data after they were inserted.
I hope if there are so relatively simpler solution on query end such as concatenate filters 192, 168.
David Alexander
unread,
Apr 8, 2015, 12:50:47 PM4/8/15
to
I suspect the problem may be in your regexp. From the sounds of things you already have a suspicion about
the meaning of the dots.
It would be helpful to have the offending line of code... but without that here is my best guess.
You need to escape the dots otherwise they match any single character. Note this is a generic regexp - I haven't checked it for MongoDB syntax compliance,
/[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}/
You may have to add further validation (start and end sentinels, restricting the range of allowed values - it depends upon your application).
Liang Gong
unread,
Apr 10, 2015, 9:39:49 AM4/10/15
to
There is no regular expression involvement at all. it is a common full text search such as db.XXXX.find({$text:{$search:"192.168.0.1"}})
David Alexander
unread,
Apr 10, 2015, 1:31:52 PM4/10/15
to
Now that you have asked a specific question, I can give a specific answer...
From
$search Field
In the $search field, specify a string of words that the text operator parses and uses to query the text index. The text operator treats most punctuation in the string as delimiters, except a hyphen - that negates term or an escaped double quotes \" that specifies a phrase.
Phrases
To match on a phrase, as opposed to individual terms, enclose the phrase in escaped double quotes (\"), as in:
If the $search string includes a phrase and individual terms, text search will only match the documents that include the phrase. More specifically, the search performs a logical AND of the phrase with the individual terms in the search string.