Wednesday, April 6, 2016

Search Challenge (4/6/16): When you want just the headlines...

Sometimes... 

... when you do a search, you want to look just in certain regions of the document. 

The LA of dreams

For instance, a newspaper article usually has a title, a first paragraph, an author, a publication name, and a publication date.  (Example: headline:"LA City Council wise gun-safety action,"  published-by: Los Angeles Times, date: Aug 10, 2015.)  Being able to search just within one of those kinds of fields would be a boon to searchers.  

In other words, news articles (and documents in general) have several kinds of metadata that you can search on, which gives you a very fine-grained search ability.  
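To make the idea of field-restricted search concrete, here's a minimal sketch in Python. It assumes you already have article records with separate metadata fields. The `Article` class and `search_headlines` function are hypothetical names invented for this illustration, not part of any search engine's API, and the second sample record is made up.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    """Hypothetical record holding the metadata fields described above."""
    headline: str
    publication: str
    published: date
    body: str


def search_headlines(articles, *terms):
    """Return articles whose headline contains every term (case-insensitive)."""
    lowered = [t.lower() for t in terms]
    return [a for a in articles if all(t in a.headline.lower() for t in lowered)]


articles = [
    # Taken from the example in the post.
    Article("LA City Council wise gun-safety action", "Los Angeles Times",
            date(2015, 8, 10), "(body text omitted)"),
    # Made-up record: the word "gun" appears only in the body, not the headline.
    Article("Council debates street repairs", "Hypothetical Daily",
            date(2015, 9, 1), "A gun was mentioned in passing."),
]

hits = search_headlines(articles, "gun", "council")
for a in hits:
    print(a.published.isoformat(), a.publication, "|", a.headline)
```

Only the first record matches, because the search looks at the headline field alone and ignores the body text. That is the difference between a fielded search and an ordinary full-text search.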

This week, a reporter wrote to me to ask if there was some way to search through newspapers in his home town for any headlines about a particular topic.  

What he wanted to do was to get a sense for how much news coverage a particular topic received over time.  Did the news in his town really cover the topic?  Or did they just let it slide?  How could you tell?  His idea was to look at the headlines, and count up how often the topic had been written about.  

I realized today that knowing how to do this is a valuable skill for SearchResearchers, and hence, it makes a great Challenge.  Here's this week's Challenge, modified slightly to protect the person who suggested the idea.

Can you figure out how to do this? 

As you know, I'm originally from Southern California, Los Angeles to be exact, so I'm always curious about what's going on there.  

Suppose I'm a reporter trying to understand how the Los Angeles City Council deals with gun-related issues.  Can you (expert SearchResearchers) tell me how to do the following? 

1. Can you search the major news outlets in the Los Angeles (LA) region for news articles over the past year that report on the City Council considering any kind of gun-related actions?  (Be generous here--if the council heard a report about the use of guns, that would count.)  

2. (Harder) Can you find the top 100 LA City Council headlines on guns, and then extract the publication dates to create a week-by-week histogram of when these articles were published?  (This is a two-step challenge: (a) find and extract the dates, (b) put the dates into a spreadsheet and create a histogram showing the number of publications on this topic by week.)  
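If you'd rather script step (b) than use a spreadsheet, here's one possible sketch in Python. It assumes you have already extracted the publication dates in step (a). The function names are hypothetical, and apart from Aug 10, 2015 (the example above) the sample dates are made up for illustration.

```python
from collections import Counter
from datetime import date, timedelta


def week_start(d):
    """Return the Monday of the week containing d."""
    return d - timedelta(days=d.weekday())


def weekly_histogram(dates):
    """Count dates per week, keyed by each week's Monday, in date order."""
    counts = Counter(week_start(d) for d in dates)
    return dict(sorted(counts.items()))


# Illustrative publication dates only; not real search results.
sample_dates = [
    date(2015, 8, 10),
    date(2015, 8, 12),
    date(2015, 8, 19),
    date(2015, 9, 2),
]

# Print a simple text histogram: one "#" per article in that week.
for monday, n in weekly_histogram(sample_dates).items():
    print(monday.isoformat(), "#" * n)
```

Note that this sketch only lists weeks that have at least one article. Weeks with no coverage are skipped, so you would need to fill in the zero weeks yourself if you want the gaps to show up in a chart.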

Can you figure this one out?  Don't worry if you can't figure out how to do part 2--I'll show you how I did it next week.  

I can think of at least two ways to do the headline-search, but I'm curious HOW you figured out how to do it.  Would you please let us know as you write up your answer in the comments below?  

Search on! 


8 comments:

  1. Good Day, Dr. Russell. The Challenge as usual sounds so interesting and very helpful. Never thought about searching only headlines and news metadata.

    I'll SearchReSearch and return with my findings

    1. Good day, Dr. Russell and everyone.

      [news headlines metadata]

      [news search by headlines]

      News Lookup looks promising, but I need to understand how to search on it

      Up Close: Using The “News Keywords” Tag For Google News

      There is a headline, "Escrowyou too, judge." I tried searching for that story but still found nothing. I only found it with ["Escrowyou too, judge" site:Nypost.com]

      I think I need a way to use "News Keywords" Tag

      [news keyword search]

      Also trying [allintitle: "Los Angeles" "City Council" gun location:"Los Angeles"] in Google News. I think this could work, but it has some issues, for example, knowing which news is unique. This query gives just 4 results, so I think this is not the answer.

      Now, I'll try with Jon's way.

  2. Dan, what does "past year" mean here? Search uses that phrase to mean 2015. Do you mean that, or the past 12 months? jon

    1. Either is fine. (I think the time from March 2015 - March 2016 is probably simplest...)

  3. I found this, which I think explains how to fulfill the requirements of the Challenge.

    http://www.lunametrics.com/blog/2014/11/24/schema-metadata-google-tag-manager/

    Extracting Schema & Metadata With Google Tag Manager

    This will totally use my brain cell

    jon

  4. This search is the very technique I need but was unable to come up with on my own.

  5. google search news metadata

    http://www.lunametrics.com/blog/2014/11/24/schema-metadata-google-tag-manager/

    http://www.information-age.com/technology/information-management/2103988/the-metadata-strategy-behind-news-search-service-factiva
    This would be perfect except for the $250 per month cost.
    ===========================================================================

    [gun OR guns "los angeles city council" OR "la city council"] then filter results for time Mar 15, 2015 - Mar 15, 2016

    Works well, but also finds Guns n Roses and radar guns along with the LA council.


    https://developers.google.com/custom-search/docs/structured_search#colorization

    https://www.google.ca/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=how%20to%20make%20a%20google%20histogram

    After a bunch of clicking I found myself back where we were a couple of years ago. I see my exercises there. But now, I have no idea how I did that stuff.

    Is there really no interest in this topic among the regulars?

    Cheers jon

    1. Hello Jon! I tried some queries and some ways and didn't find anything that can solve the Challenge.

      This is something that looked good

      [news headlines metadata]

      [news search by headlines]

      News Lookup looks promising, but I need to understand how to search on it

      Up Close: Using The “News Keywords” Tag For Google News

      There is a headline, "Escrowyou too, judge." I tried searching for that story but still found nothing. I only found it with ["Escrowyou too, judge" site:Nypost.com]

      I think I need a way to use "News Keywords" Tag

      [news keyword search]

      Also trying [allintitle: "Los Angeles" "City Council" gun location:"Los Angeles"] in Google News. I think this could work, but it has some issues, for example, knowing which news is unique. This query gives just 4 results, so I think this is not the answer.

      I already want to read Dr. Russell's solution. I am sure many others like me have no idea how to solve it.

      Enjoy day
