Welcome to the New Andrew Bailey

Hello and welcome to my blog. If you've been here before, things might look a little different, especially if you came in through the homepage. I have implemented a few things I have gathered by doing research for my podcast, and several hours of toying around.

A screenshot of the Andrew Bailey as it is right now.

I had an idea a few weeks ago. I often do sorta-reviews on videogames, and they are always accompanied by a screenshot of sorts. On my podcast a long time ago, I came across an article that explained how you could put text on any image and have everything look good. I decided to apply that to my images and put the article title on it near the bottom. I combined this with an idea I stole while looking at Polygon: fade out a paragraph below it. I whipped up a prototype and loved it. After I did the change for real, a graphic designer at work thought it was a clever replacement for ellipses.

The only problem is pulling out the image and paragraph. I decided to build some regex that would pull out the first image and paragraph. Even though you shouldn't use regex on HTML, the world isn't pretty, and you must resort to hacks sometimes. This ended up not being that hard, since I author my posts in markdown, and markdown output is predictable, generally. For posts that do not have an image, simply display the first paragraph.

There is an unexpected upside to this. Since the entire article isn't displayed with several others, it permits more than one image in my posts, and keep page sizes down. And I guarantee that only the first one will show with the others.

The other big change is search. You can see a text box on the sidebar there. I have used PostgreSQL's full text search feature for this. It ended up being somewhat simple to implement. Add a column to your table. If you want this row to appear in the search, the query must match something in this column.

ALTER TABLE myDatabase.myTable ADD COLUMN mySearchData SET DATA TYPE tsvector USING 
    setweight(to_tsvector(COALESCE(myTitle,'')), 'A') || setweight(to_tsvector(COALESCE(myArticleContent,'')), 'B') || setweight(to_tsvector(COALESCE(myDescription,'')), 'B');

CREATE INDEX mytable_search_index ON myDatabase.myTable USING gin(mySearchData);

This adds a column and uses existing data to populate it. COALESCE is handy to make sure nulls won't accidentally invade your data. to_tsvector is where magic happens; it trims a bunch of words into a list of single forms. For example, it turns words like 'games', 'gamed', 'gaming', and 'gamer' into 'game'. setweight does what it says on the label: makes things more or less important. You can only give it four levels: A, B, C, or D.

For performance reasons, create an index on this column.

To update this column without thinking about it, create a trigger. Use the same formula you used when setting up the column:

CREATE OR REPLACE FUNCTION myDatabase.populate_mytable_search_index() RETURNS trigger AS $$
begin  
    new.mySearchData := setweight(to_tsvector(COALESCE(new.myTitle,'')), 'A') || setweight(to_tsvector(COALESCE(new.myArticleContent,'')), 'B') || setweight(to_tsvector(COALESCE(new.myDescription,'')), 'B');
    return new;
end $$ LANGUAGE plpgsql;

CREATE TRIGGER index_myTable BEFORE INSERT OR UPDATE ON myTable FOR EACH ROW EXECUTE PROCEDURE myDatabase.populate_mytable_search_index();

And now the important part: query the column:

SELECT * FROM myDatabase.myTable mt, plainto_tsquery('MY_QUERY') query WHERE query @@ mt.mySearchData ORDER BY ts_rank_cd(mt.mySearchData, query) DESC, my.posted DESC;

plainto_tsquery (and its more complicated brother, to_tsquery) constructs an object that looks at all those tsvectors that you made earlier. And what's a search without some sort of sorting? ts_rank_cd does that. As expected, it's heavily influenced by the setweight values that go into the column. And since I've noticed that many of my articles have identical rank, I also sort by posted date. It's why my "Borderlands 2" article appears before the original "Borderlands" one, and this one will appear, but below them.

I love features like this that use a different tool from a toolbox that I already have! It's not perfect, but I think it's awesome already. I intend on messing with it over time. For example, it will show plenty of results for "borderlands", but nothing for "border lands".

I hope you enjoy these new goodies. Happy reading!

Posted under Programming.

No new comments may be posted for this article at this time.