Bing Translate YQL Table

With Yahoo’s Babel Fish translation web service gone and Google’s translate service requiring payment, only one free translation API is left: Bing Translate. It’s not the easiest API to interface with, but YQL and its execute block make it easy. Bing Translate generously gives you 2 million characters a month for free; anything more than that will cost you. (*I will add this table to github.com/yql shortly.)

<?xml version="1.0" encoding="UTF-8"?>
<table xmlns="http://query.yahooapis.com/v1/schema/table.xsd">
    <meta>
        <author>Paul Donnelly (@pjdonnelly)</author>
        <description>Get your Bing translate keys here: https://datamarket.azure.com/developer/applications/</description>
        <documentationURL>http://msdn.microsoft.com/en-us/library/ff512421</documentationURL>
        <sampleQuery>select * from {table} where text="hello world" and client_id="" and client_secret=""</sampleQuery>
    </meta>
    <bindings>
        <select itemPath="" produces="XML">
            <urls>
                <url></url>
            </urls>         
            <inputs>
                <key id="text" type="xs:string" paramType="query" required="true" />
                <key id="from" type="xs:string" paramType="query" default="en" />
                <key id="to" type="xs:string" paramType="query" default="es" />

                <key id="client_id" type="xs:string" paramType="query" required="true" />
                <key id="client_secret" type="xs:string" paramType="query" required="true" />
                <key id="scope" type="xs:string" paramType="query" default="http://api.microsofttranslator.com" />
                <key id="grant_type" type="xs:string" paramType="query" default="client_credentials" />
            </inputs>
            <execute>
                <![CDATA[
                    // Form-encode each value so characters such as "&", "+" and "/" survive the POST body.
                    var params = "client_id=" + encodeURIComponent(client_id) +
                                 "&client_secret=" + encodeURIComponent(client_secret) +
                                 "&scope=" + encodeURIComponent(scope) +
                                 "&grant_type=" + encodeURIComponent(grant_type);
                    y.log(params);

                    // Step 1: exchange the client credentials for an OAuth access token.
                    var resp = y.rest("https://datamarket.accesscontrol.windows.net/v2/OAuth2-13/").accept('application/json').contentType("application/x-www-form-urlencoded").post(params).response;
                    // Step 2: call Translate with the token as a Bearer header; query() adds each parameter to the request URL.
                    var MSTranslate = y.rest("http://api.microsofttranslator.com/v2/Http.svc/Translate").query("text", text).query("from", from).query("to", to).header("Authorization", "Bearer " + resp.access_token).get().response;

                    response.object = MSTranslate;
                ]]>
            </execute>
        </select>
     </bindings>
</table>
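
To call the table, a query first has to load the table definition with a `use` statement. Here is a minimal sketch of assembling that statement in JavaScript. The table URL, the `bingtranslate` alias and the `quoteYql` and `buildTranslateQuery` helpers are all hypothetical names made up for illustration, and the quoting logic simply refuses values that would need escaping.

```javascript
// Hypothetical helper: wrap a value in whichever quote style it does not contain.
function quoteYql(value) {
    var s = String(value);
    if (s.indexOf('"') === -1) { return '"' + s + '"'; }
    if (s.indexOf("'") === -1) { return "'" + s + "'"; }
    throw new Error("value contains both quote styles: " + s);
}

// Hypothetical helper: build the YQL statement for the table definition above.
function buildTranslateQuery(opts) {
    var where = [
        "text=" + quoteYql(opts.text),
        "client_id=" + quoteYql(opts.clientId),
        "client_secret=" + quoteYql(opts.clientSecret)
    ];
    // "to" is optional; the table falls back to its default when it is omitted.
    if (opts.to) { where.push("to=" + quoteYql(opts.to)); }
    return "use " + quoteYql(opts.tableUrl) + " as bingtranslate; " +
        "select * from bingtranslate where " + where.join(" and ");
}

var translateQuery = buildTranslateQuery({
    tableUrl: "http://example.com/bingtranslate.xml", // placeholder location of the table XML
    text: "hello world",
    clientId: "MY_CLIENT_ID",         // placeholder credentials
    clientSecret: "MY_CLIENT_SECRET",
    to: "fr"
});
```

The resulting statement would then be URL-encoded and passed as the `q` parameter of a YQL request.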
Posted in yahoo | 2 Comments

Protected: Top movies

Posted in yahoo | Leave a comment

Running and Cooking – a new year resolution

Last year I decided I would commit myself to trail running on a consistent basis. I’m usually pretty bad at sticking to my resolutions, but trail running stuck and is something I now enjoy. I have my friend Brad to thank for dragging me out to my first trail race in Sept 08.

As the year progressed so did my results:

  1. Angel Island (8k) – 6/20/09 – 47th Place
  2. Santa Cruz Mtns. (10k) – 9/27/09 – 56th Place
  3. Skyline Ridge (14k) – 10/04/09 – 36th Place
  4. Santa Monica Mtns. (9k) – 11/22/09 – 28th Place

For 2010 I plan to race the mid distance runs ~17-24k.

Also for 2010, I plan to cook more! As a passive fan of smittenkitchen.com, I finally made something from their wonderful website: homemade tomato sauce with onion and butter. It came out pretty damn good.

I bookmarked a few more things to try out:

And the master list.

Posted in food, running | Tagged , | Leave a comment

YQL and JSONP-X (aka. json-p-x, jsonpx, json-px)

Amid all the buzz about YQL’s new Insert/Update/Delete, another new feature was released at the same time: JSONP-X.

JSONP-X is essentially an escaped XML string returned as a JSON result and wrapped in a JavaScript callback function. To get it, request format=xml together with a callback parameter, as in this example:
http://query.yahooapis.com/v1/public/yql?q=<my yql query>&format=xml&callback=mycallback

and a basic structure:

mycallback({
    "query": {yql meta data here},
    "results": ["<escaped xml/html here>"]
});
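
Given that structure, a callback mostly needs to pull the escaped markup back out of the `results` array. A minimal sketch, assuming the payload has the shape shown above; `jsonpxToHtml` and the sample payload are hypothetical and only illustrate the idea:

```javascript
// Joins the markup fragments of a JSONP-X payload into one HTML string.
// Returns an empty string when the payload has no usable results array.
function jsonpxToHtml(payload) {
    if (!payload || !Array.isArray(payload.results)) { return ""; }
    return payload.results.join("");
}

// Made-up payload in the same shape as the structure above.
var samplePayload = {
    query: { count: "1" },
    results: ["<div id=\"maillots\"><h2>Jersey holders<\/h2><\/div>"]
};

var badgeHtml = jsonpxToHtml(samplePayload);
```

The string in `badgeHtml` is what would end up being assigned to an element's innerHTML.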

The power of YQL’s JSONP-X really comes into play when page scraping a website. It allows you to extract HTML – keeping the HTML structure in the JSON results and using Javascript to innerHTML the results into your webpage. This makes badging much easier. (YQL and Pipes respect robots.txt so html scraping will only work on sites that are happy to have their content indexed by search engines and cached elsewhere.)

I’m a cycling fan and the Tour de France is the World Series of cycling. So I wanted to create a badge that leveraged the new JSONP-X feature to extract the nice results module on http://www.letour.fr/us/homepage_courseTDF.html

For the impatient, here is the final example page: http://paul.donnelly.org/demos/YQL_JSONP-X.html

First of all, scraping www.letour.fr to get this module wasn’t exactly straightforward. On closer inspection, that module is created dynamically based on the current stage.

select * from html where url="http://www.letour.fr/us/homepage_courseTDF.html" and xpath='//div[@id="maillotDyn"]'

The above query yielded an empty div.

Upon further poking of the page I found that this function:

maillotFunc = function(){
	makeRequest(prefixPath + 'blocPorteursMaillots.html?'+timestamp, 'maillotDyn', 'HTML', true, false);
}

created the module.

Now that I found the function I knew the html page that I needed to scrape.

But what is the “prefixPath”? Apparently it was generated dynamically on the front page and defined in the JavaScript. I could create a YQL execute statement that regexes that script node or… wait.

I also noticed that href paths to various links had the dynamically created “prefixPath” as well, for example:

<li class="level2"><a href="/2009/TDF/LIVE/us/500/classement/index.html">Standings</a></li>

Ah, yes I can use that path to construct “http://www.letour.fr/2009/TDF/LIVE/us/500/blocPorteursMaillots.html” the final endpoint.

OK, so let’s create a YQL query that fetches me one of those links:

select * from html where url="http://www.letour.fr/us/homepage_courseTDF.html" and xpath="//li[@class='level2'][2]/a"

Great, now I have a nice result that gives me my prefix. So how do I go about regexing that path out and constructing the final, complete URL that I need? I guess I’ll have to create a YQL execute statement that performs the regex. But wait. I’m feeling kind of lazy this morning and don’t want to spend a lot of time on this.

I can use Yahoo Pipes to leverage my regex! Get the cleaned up results from Yahoo Pipes as JSON and then do my final JSONP-X call. Check out the Pipe here.

In my Pipe, I use the YQL module to get the prefixPath from the A tag. I then use the Regex module to construct the final URL I want YQL to scrape. (In [item.href] Replace [^(.*?)/classement.*] with [http://www.letour.fr$1/blocPorteursMaillots.html])
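
For reference, the same replacement can be sketched in plain JavaScript. This is not how the Pipe runs it internally, just the equivalent regex applied to the sample href from earlier; the `toJerseyUrl` name is made up for illustration.

```javascript
// Turns a "classement" link path into the blocPorteursMaillots.html URL,
// using the same pattern and replacement as the Pipes Regex module above.
function toJerseyUrl(href) {
    return href.replace(/^(.*?)\/classement.*/, "http://www.letour.fr$1/blocPorteursMaillots.html");
}

var jerseyUrl = toJerseyUrl("/2009/TDF/LIVE/us/500/classement/index.html");
// jerseyUrl is "http://www.letour.fr/2009/TDF/LIVE/us/500/blocPorteursMaillots.html"
```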

Sweet. I then can use http://pipes.yahoo.com/pipes/pipe.run?_id=KBC0Ye1r3hGpf2CaqevxTA&_render=json as a way to get to my final URL via a sub select.

This is the final YQL statement I used: select * from html where url in (select href from json where url="http://pipes.yahoo.com/pipes/pipe.run?_id=KBC0Ye1r3hGpf2CaqevxTA&_render=json" and itemPath = "json.value.items") and xpath = "/html/body/div"

and this is the JSONP-X call: http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%20in%20%28select%20href%20from%20json%20where%20url%3D%22http%3A%2F%2Fpipes.yahoo.com%2Fpipes%2Fpipe.run%3F_id%3DKBC0Ye1r3hGpf2CaqevxTA%26_render%3Djson%22%20and%20itemPath%20%3D%20%22json.value.items%22%29%20and%20xpath%20%3D%20%22%2Fhtml%2Fbody%2Fdiv%22&format=xml&callback=phoningHome
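
Hand-encoding a URL like that is error-prone, so here is a small sketch of building it from the raw statement. `buildJsonpxUrl` is a hypothetical helper; note that encodeURIComponent leaves a few characters such as `*`, `(` and `)` unescaped, so its output will not match the URL above byte for byte even though it decodes to the same query.

```javascript
// Builds a JSONP-X request URL: format=xml plus a callback name.
function buildJsonpxUrl(yqlQuery, callbackName) {
    return "http://query.yahooapis.com/v1/public/yql" +
        "?q=" + encodeURIComponent(yqlQuery) +
        "&format=xml" +
        "&callback=" + encodeURIComponent(callbackName);
}

var finalQuery = 'select * from html where url in (select href from json where ' +
    'url="http://pipes.yahoo.com/pipes/pipe.run?_id=KBC0Ye1r3hGpf2CaqevxTA&_render=json" ' +
    'and itemPath = "json.value.items") and xpath = "/html/body/div"';

var jsonpxUrl = buildJsonpxUrl(finalQuery, "phoningHome");
```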

and the JSONP-X structure:

phoningHome({
    "query": {
        "count": "1",
        "created": "2009-07-10T07:09:52Z",
        "lang": "en-US",
        "updated": "2009-07-10T07:09:52Z",
        "uri": "http://query.yahooapis.com/v1/yql?q=select+*+from+html+where+url+in+%28select+href+from+json+where+url%3D%22http%3A%2F%2Fpipes.yahoo.com%2Fpipes%2Fpipe.run%3F_id%3DKBC0Ye1r3hGpf2CaqevxTA%26_render%3Djson%22+and+itemPath+%3D+%22json.value.items%22%29+and+xpath+%3D+%22%2Fhtml%2Fbody%2Fdiv%22",
        "diagnostics": {
            "publiclyCallable": "true",
            "url": [{
                "execution-time": "14",
                "content": "http://pipes.yahoo.com/pipes/pipe.run?_id=KBC0Ye1r3hGpf2CaqevxTA&_render=json"
            },
            {
                "execution-time": "350",
                "content": "http://www.letour.fr/2009/TDF/LIVE/us/700/blocPorteursMaillots.html"
            }],
            "user-time": "370",
            "service-time": "364",
            "build-version": "2174"
        }
    },
    "results": ["<div id=\"maillots\">\n    <h2>Jersey holders<\/h2>\n    <noscript>\n      <div class=\"errormes\">\n        <p>Activate Javascript/Flash for the automatic refresh and\n        the display of the tabs.<\/p>\n      <\/div>\n    <\/noscript> \n    <div id=\"porteurmaillotGeneral\">\n      <ul>\n        <li class=\"jaune\">\n          <a href=\"/2009/TDF/RIDERS/us/coureurs/33.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/33.html');return false;\">CANCELLARA\n          F.<\/a>\n          <a class=\"cob\" href=\"/2009/TDF/RIDERS/us/coureurs/33.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/33.html');return false;\">SAX<\/a>\n        <\/li>\n        <li class=\"vert\">\n          <a href=\"/2009/TDF/RIDERS/us/coureurs/71.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/71.html');return false;\">CAVENDISH\n          M.<\/a>\n          <a class=\"cob\" href=\"/2009/TDF/RIDERS/us/coureurs/71.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/71.html');return false;\">THR<\/a>\n        <\/li>\n        <li class=\"apois\">\n          <a href=\"/2009/TDF/RIDERS/us/coureurs/122.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/122.html');return false;\">AUGE\n          S.<\/a>\n          <a class=\"cob\" href=\"/2009/TDF/RIDERS/us/coureurs/122.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/122.html');return false;\">COF<\/a>\n        <\/li>\n        <li class=\"blanc\">\n          <a href=\"/2009/TDF/RIDERS/us/coureurs/76.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/76.html');return false;\">MARTIN\n          T.<\/a>\n          <a class=\"cob\" href=\"/2009/TDF/RIDERS/us/coureurs/76.html\" onclick=\"SesameCoureur('/2009/TDF/RIDERS/us/coureurs/76.html');return false;\">THR<\/a>\n        <\/li>\n      <\/ul>\n    <\/div><\/div>"]
});

Like I said, letour.fr didn’t make it easy, but most websites are much easier to scrape if it’s static html.

Now the easy part. Here’s the JS source that makes the YQL JSONP-X call, parses it and innerHTML’s the escaped HTML into a div.

var sURL = "http://query.yahooapis.com/v1/public/yql?q=select%20*%20from%20html%20where%20url%20in%20%28select%20href%20from%20json%20where%20url%3D%22http%3A%2F%2Fpipes.yahoo.com%2Fpipes%2Fpipe.run%3F_id%3DKBC0Ye1r3hGpf2CaqevxTA%26_render%3Djson%22%20and%20itemPath%20%3D%20%22json.value.items%22%29%20and%20xpath%20%3D%20%22%2Fhtml%2Fbody%2Fdiv%22&format=xml&callback=phoningHome";
 
var transactionObj = YAHOO.util.Get.script(sURL, {
    onSuccess : function(o) {o.purge();},
    onFailure : function() {YAHOO.util.Dom.get("badge").innerHTML = "error"},
    scope     : this
});
 
var phoningHome = function(r) { //the callback function
     // r.results is an array of escaped HTML strings; join them before inserting
     YAHOO.util.Dom.get("badge").innerHTML = r.results.join("");
};

And finally, the final example page here, and source.

Footnotes:

The main trouble with this method is that you have to manually copy over the CSS from the site you are scraping if you want to render their styling. If you copy over their entire style sheet, make sure it doesn’t clash with your existing styles.

Also, it’s quite easy for the publisher you are scraping to insert a nasty <script> tag with JavaScript that does malicious things to your users or page, so be wary. If you want sanitized HTML output, add the sanitize option at the end of your YQL query. As of this writing there is a bug if you want to sanitize the entire output: instead of sanitize(), use sanitize(field='').

If the HTML you are scraping uses relative links (most will), I found the <base> tag useful to ensure these links actually work. Alternatively, you can regex the results and modify the links that way.
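
As a rough sketch of the regex route, the function below prefixes root-relative href and src attributes with the origin of the scraped site. `absolutizeLinks` is a hypothetical helper; it only handles paths that start with a single slash and does not touch paths buried inside onclick handlers.

```javascript
// Prefixes root-relative href/src attribute values with the given origin.
// Protocol-relative URLs ("//host/path") are left alone.
function absolutizeLinks(html, origin) {
    return html.replace(/\b(href|src)=(["'])\/(?!\/)/g, function (match, attr, quote) {
        return attr + "=" + quote + origin + "/";
    });
}

var scraped = '<a href="/2009/TDF/RIDERS/us/coureurs/33.html">CANCELLARA F.</a>';
var fixed = absolutizeLinks(scraped, "http://www.letour.fr");
```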

The example page I created is best used as a badge inside an <iframe>.

Posted in yahoo | Tagged , , | 6 Comments

Top 10 YQL execute links

Some links about YQL’s new execute feature:

Posted in yahoo | Tagged , | 2 Comments

Yahoo Pipes will not be affected by the closing of Brickhouse

There has been some confusion around the net.

http://www.techcrunch.com/2008/12/09/yahoo-to-close-brickhouse-by-end-of-year/
http://twitter.com/ccarmichael/statuses/1049692691
http://twitter.com/armoredtech/status/1049651550
http://twitter.com/clintlalonde/statuses/1049663508
http://twitter.com/armoredtech/statuses/1049650384

Yahoo Pipes will not be affected by the closing of Brickhouse. Pipes is currently under the Yahoo Open Social group. So don’t worry, your Pipes will still be around.

Good luck to all the team members in Brickhouse and all of my Yahoo co-workers that got laid off today.

Posted in yahoo | Tagged , | Leave a comment

I love chill drum n’ bass.

This is one of the better chill mixes I’ve heard recently. I love this old skool stuff. Still resonates today.
(btw, the videos suck, the music rocks, have a listen)

Everything But The Girl – Single (Photek remix)
GrooveShark link (edit: since the youtube vids are yanked have a listen here)

Also this one is quite sick.

Everything But The Girl – Before Today (Dillinja Remix)

Posted in music | Tagged , , | Leave a comment

2-legged OAuth Javascript Function for YQL

Here is a 2-legged OAuth Javascript function that makes it easy to get YQL results into your Javascript application.

First we want to include the oauth javascript libraries obtained from http://oauth.googlecode.com.

<script type="text/javascript" src="http://oauth.googlecode.com/svn/code/javascript/oauth.js"></script>
<script type="text/javascript" src="http://oauth.googlecode.com/svn/code/javascript/sha1.js"></script>

It’s probably best to download those files to your local server, instead of calling them directly from googlecode in this example.

We then use the makeSignedRequest function to build the signed URL string for query.yahooapis.com so we can get the YQL results.

<script type="text/javascript">
var makeSignedRequest = function(ck,cks,encodedurl) {     
 
	var accessor = { consumerSecret: cks, tokenSecret: ""};          
	var message = { action: encodedurl, method: "GET", parameters: [["oauth_version","1.0"],["oauth_consumer_key",ck]]};
 
	OAuth.setTimestampAndNonce(message);
	OAuth.SignatureMethod.sign(message, accessor);
 
	var parameterMap = OAuth.getParameterMap(message);
	var baseStr = OAuth.decodeForm(OAuth.SignatureMethod.getBaseString(message));           
	var theSig = "";
 
	if (parameterMap.parameters) {
		for (var item in parameterMap.parameters) {
			for (var subitem in parameterMap.parameters[item]) {
				if (parameterMap.parameters[item][subitem] == "oauth_signature") {
					theSig = parameterMap.parameters[item][1];                    
					break;                      
				}
			}
		}
	}
 
	// baseStr[2][0] holds the normalized parameter string from the signature base string.
	var paramList = baseStr[2][0].split("&");
	paramList.push("oauth_signature=" + encodeURIComponent(theSig));
	// Sort the "name=value" strings lexicographically.
	paramList.sort();

	// baseStr[1][0] is the request URL without its query string.
	var finalStr = baseStr[1][0] + "?" + paramList.join("&");
 
	return finalStr;
};
</script>

Here is an example of how to call the function.

makeSignedRequest("<API Key HERE>","<Shared Secret HERE>","<YQL URL HERE>");

If you haven’t already created your API key and Shared Secret, you can do so by going here. Use the URL from the YQL console, located in the textarea to the right of where you enter the query.

<script type="text/javascript">
var signedURL = makeSignedRequest("dj0yJmk9Rm1MUU9iWmdNZ2FjJmQ9WVdrOVZWWk9Wa3h5TldFbWNHbzlNVEk0TXpNMk1EYzFPQS0tJnM9Y29uc3VtZXJzZWNyZXQmeD1kMg--","570e2ef3db460b114e6a0a987709a0f6a90b5ec0","http://query.yahooapis.com/v1/yql?q=select%20*%20from%20search.news%20where%20query%3D%22obama%22&format=json&callback=myCallback");
</script>

And it’s as easy as that! You can then make cool apps that use dynamic script nodes to bring in the YQL data.
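
A dynamic script node can be sketched in a few lines. `loadScript` is a hypothetical helper, not part of the OAuth library; it takes the document as a parameter so the same code can be exercised outside a browser, and it assumes the signed URL carries a callback parameter naming a global function such as myCallback.

```javascript
// Appends a <script> element pointing at the given URL to the document head.
// The server's response then invokes the callback named in the URL.
function loadScript(doc, url) {
    var script = doc.createElement("script");
    script.type = "text/javascript";
    script.src = url;
    doc.getElementsByTagName("head")[0].appendChild(script);
    return script;
}

// In a browser, with signedURL produced by makeSignedRequest:
//     window.myCallback = function (data) { /* use data.query.results */ };
//     loadScript(document, signedURL);
```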

Here is a sample app created with all the ingredients from above.

Posted in yahoo | Tagged , , | 9 Comments

visual music.

just in time for Halloween..

Posted in music | Tagged , | Leave a comment

YQL launches!

After many months of work, YQL has finally launched. It’s exciting to see such a powerful tool released that unifies many of Yahoo’s web services and allows mashups with external sources.

I created the console part of YQL, which is a small part of the YQL picture, but it allows developers to test out queries quickly without having to set up their environments to debug. OAuth authentication is required to run any YQL query; since the console doesn’t need this, it’s the fastest way to test and create new queries. The console also gives developers sample queries to help them get started and shows what data tables are available. YQL results come in XML, JSON or PHP (if using the PHP SDK) formats.

Tip – You can share YQL queries with your friends like this:

http://developer.yahoo.com/yql/console/?q=<query here>

http://developer.yahoo.com/yql/console/?q=select%20urls.url%20from%20flickr.photos.info%20where%20photo_id%20in%20(select%20id%20from%20flickr.photos.search%20where%20text=%22lolcat%22%20limit%2010)

Jonathan, our team architect, recently gave an awesome presentation on YQL; please check it out. Props to the rest of the YQL team: Nagesh, Josh, Brad and Sam.

Posted in yahoo | Tagged | Leave a comment