SELECT * FROM t1 WHERE title REGEXP '(this[[.space.]{1,}]rocks)'

Trying to find ways of optimizing my code that sorts the jobs, I remembered reading about regex in MySQL. I can make my way round regex, but its not my strongest point so never took much notice. Also not having a need for it, I filed it in to-do.

The problem: I have 15k new jobs pulled from the RSS feeds. Now I want to know what the pay is like for each one. Not all jobs have salaries in them, so its a waste of ram looping through the entire database. But if you could just get the jobs that do have a salary you would be saving quite a lot of processing.

My first thought was this is cool, ill just add a CakePHP virtual field and get the salaries on the fly, but unfortunately MySQL can only match. It cant pull the match out of the data. So I had to pull the whole record and use Php's preg_match() to get the actual salary.

After playing with the way it works and fine tuning my query in PhpMyAdmin I turned to CakePHP's Model::find() only to find that it does not like the regex. I did not play with it to much, but it was just stripping it out of the find completely. When I have a little more time I will dig into the source and see why it does not like it.

No that I am not pulling all the records, processing the data is much quicker. When I have it all working nicely ill slot the logic into the Model::beforeSave() so that it all happens magically.

Read more...

No more posts