A LINQ provider for Web queries
To start a series of "LINQ provider" posts, I am uploading a sample provider today that, in a sense, treats the Internet as a database. For a SQL Server database, you make tables accessible to LINQ by writing classes whose attributes define how objects of those classes are retrieved from table rows; LINQ then uses these classes to issue queries against the database. Similarly, this provider lets you add attributes to classes to specify how their objects are retrieved from Web pages, and you can then issue LINQ queries against them.
The project "WebLinq" in the attached solution contains this provider. It is not very sophisticated and consists of just three files:
- WebLinqAttributes.cs contains the attributes that are recognized
- WebContext.cs defines the class that your WebLinq-enabled classes inherit from
- Utils.cs contains helper functions to GET / POST to a web site and to find substrings in a text.
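The substring helpers in Utils.cs are not shown in this post. As a rough sketch of what such a helper might look like, the hypothetical `TextBetween` method below returns the text between a start marker and an end marker. Its name, signature, and behavior are assumptions for illustration, not the actual API of Utils.cs.

```csharp
using System;

// Hypothetical helper; the real Utils.cs in the attached solution may differ.
public static class SubstringSketch
{
    // Returns the text between the first occurrence of 'start' (searching
    // from index 'from') and the next occurrence of 'end' after it,
    // or null if either marker is missing.
    public static string TextBetween(string text, string start, string end, int from = 0)
    {
        int s = text.IndexOf(start, from, StringComparison.Ordinal);
        if (s < 0) return null;
        s += start.Length;                       // skip past the start marker
        int e = text.IndexOf(end, s, StringComparison.Ordinal);
        if (e < 0) return null;
        return text.Substring(s, e - s);
    }
}
```

For example, `SubstringSketch.TextBetween("year = 2002,", "year = ", ",")` yields the string "2002", and a text without the start marker yields null.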
The project "WebSources" defines some classes for
- Searching for articles in the CiteSeer web sites (see below)
- Searching for articles in the MSDN web sites
- Translating words / sentences
- Integrating functions of one variable
- Looking up the current values of stocks from the company symbol
The project "SimpleDemos" uses these two DLLs to demonstrate the last three classes.
The project "TestWebLinq" demonstrates access to the CiteSeer Web site.
CiteSeer is a database of computer science articles. You can search for articles by keyword, obtain information about them, and often even retrieve them directly from the Web site.
To use the CiteSeer demo, enter, for example, "Support Vector Machines" in the text box labeled "Search terms" and click the "Retrieve" button. It takes a while to visit the Web pages that list the available articles, visit the page for each article, retrieve its information, and access another Web page for details, but then you should see a list of paragraphs, each containing
- Author's name(s)
- Title and year
- Some three lines of introduction
- URL for this article
- URL for downloading the article as a PDF file
- Information about the rights for this article
If you are only interested in newer articles, enter 2002 in the "Publication year >=" text field and click "Retrieve" again (currently I get 3 results back).
Here is how the corresponding query looks in the code:
var doc = new GoogleCiteSeer(searchTerms, 0);
var query = from art in doc.Articles
            where art.details.Document != null
               && art.details.Document.bibtex != null
               && art.details.Document.bibtex.year >= minYear
            select art.details;
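The null checks in the query presumably guard against pages where a section could not be found, so that the year comparison only runs on articles whose BibTeX part was actually read. To show the shape of the query in isolation, here is a minimal sketch that runs it over in-memory objects instead of Web pages. All of the `...Stub` types and the sample data are made up for illustration; only the member names `details`, `Document`, `bibtex`, and `year` are taken from the query above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the WebSources types (not the real classes).
public class BibTexStub { public string title; public int year; }
public class DocumentStub { public BibTexStub bibtex; }
public class DetailsStub { public DocumentStub Document; }
public class ArticleStub { public DetailsStub details; }

public static class QuerySketch
{
    // Same shape as the query above, evaluated by LINQ to Objects.
    public static List<DetailsStub> RecentDetails(IEnumerable<ArticleStub> articles, int minYear)
    {
        var query = from art in articles
                    where art.details.Document != null
                       && art.details.Document.bibtex != null
                       && art.details.Document.bibtex.year >= minYear
                    select art.details;
        return query.ToList();
    }

    // Placeholder data: two complete entries and two with missing parts.
    public static List<ArticleStub> SampleArticles()
    {
        return new List<ArticleStub>
        {
            new ArticleStub { details = new DetailsStub { Document = new DocumentStub {
                bibtex = new BibTexStub { title = "Older placeholder", year = 1998 } } } },
            new ArticleStub { details = new DetailsStub { Document = new DocumentStub {
                bibtex = new BibTexStub { title = "Newer placeholder", year = 2003 } } } },
            new ArticleStub { details = new DetailsStub { Document = null } },
            new ArticleStub { details = new DetailsStub { Document = new DocumentStub { bibtex = null } } },
        };
    }
}
```

Because `&&` short-circuits, the entries with a missing `Document` or `bibtex` are skipped without throwing a `NullReferenceException`. Note that, like the original query, this sketch does not guard against `art.details` itself being null.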
Here is an example for a class that defines how to read the "BibTeX" part of the Web page with details for an article:
public class CsBibTex {
    [StartPart("author = \"")] [EndPart("\"")] public string author;
    [StartPart("title = \"")] [EndPart("\"")] public string title;
    [StartPart("year = ")] [EndPart(",")] public int year;
}
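How the provider turns these attributes into field values is not shown in this post. One plausible approach is reflection: for each field, read its start and end markers, cut out the text between them, and convert it to the field's type. The following is only a sketch under that assumption. The attribute classes, the `Marker` property, `BibTexSketch`, and `AttributeParserSketch.Parse` are hypothetical re-creations, and the real code in WebLinqAttributes.cs and WebContext.cs may work differently.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical re-creations of the two attributes (not the real definitions).
[AttributeUsage(AttributeTargets.Field)]
public class StartPartAttribute : Attribute
{
    public string Marker { get; }
    public StartPartAttribute(string marker) { Marker = marker; }
}

[AttributeUsage(AttributeTargets.Field)]
public class EndPartAttribute : Attribute
{
    public string Marker { get; }
    public EndPartAttribute(string marker) { Marker = marker; }
}

// Same field layout as CsBibTex above, bound to the sketch attributes.
public class BibTexSketch
{
    [StartPart("author = \"")] [EndPart("\"")] public string author;
    [StartPart("title = \"")] [EndPart("\"")] public string title;
    [StartPart("year = ")] [EndPart(",")] public int year;
}

public static class AttributeParserSketch
{
    // Fills every public field that carries both attributes with the text
    // found between its markers, converted to the field's type.
    // Throws FormatException if the text cannot be converted (e.g. a non-numeric year).
    public static T Parse<T>(string page) where T : class, new()
    {
        var result = new T();
        foreach (FieldInfo field in typeof(T).GetFields())
        {
            var start = field.GetCustomAttribute<StartPartAttribute>();
            var end = field.GetCustomAttribute<EndPartAttribute>();
            if (start == null || end == null) continue;

            int s = page.IndexOf(start.Marker, StringComparison.Ordinal);
            if (s < 0) continue;                 // marker absent: keep the default value
            s += start.Marker.Length;
            int e = page.IndexOf(end.Marker, s, StringComparison.Ordinal);
            if (e < 0) continue;

            string raw = page.Substring(s, e - s).Trim();
            object value = Convert.ChangeType(raw, field.FieldType, CultureInfo.InvariantCulture);
            field.SetValue(result, value);
        }
        return result;
    }
}
```

Calling `AttributeParserSketch.Parse<BibTexSketch>(text)` on a text that contains `author = "..."`, `title = "..."`, and `year = 2002,` fills the three fields accordingly; fields whose markers are not found keep their default values.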
This sample code is provided as-is and does not come with any warranty.
You can modify and use the code for commercial and non-commercial purposes.