{ "metadata": { "name": "", "signature": "sha256:8cbf3426f87c392dad2ecd567667b7d6d391fa84013d59a093a187c82de97c2d" }, "nbformat": 3, "nbformat_minor": 0, "worksheets": [ { "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "# Blaze, MongoDB, and Github Data\n", "\n", "The website http://ghtorrent.org/ maintains a mirror of GitHub's public data in a large Mongo Database.\n", "\n", "* [Gain access](http://ghtorrent.org/raw.html)\n", "* [See available collections](http://ghtorrent.org/mongo.html)\n", "\n", "We access and query this database with Blaze." ] }, { "cell_type": "code", "collapsed": false, "input": [ "import blaze\n", "from blaze import Table, into\n", "blaze.__version__" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 1, "text": [ "'0.6.5'" ] } ], "prompt_number": 1 }, { "cell_type": "markdown", "metadata": {}, "source": [ "We authenticate by tunneling into the server. We previously sent them an `ssh` key.\n", "\n", " ssh -L 27017:dutihr.st.ewi.tudelft.nl:27017 ghtorrent@dutihr.st.ewi.tudelft.nl" ] }, { "cell_type": "code", "collapsed": false, "input": [ "users = Table('mongodb://ghtorrentro:ghtorrentro@localhost/github::users')\n", "users" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
avatar_urlbioblogcompanycreated_atemailfollowersfollowinggravatar_idhireablehtml_urlidlocationloginnamepublic_gistspublic_repostypeurl
0 https://secure.gravatar.com/avatar/a7e55f31bb4... None None None 2012-05-04T13:59:54Z None 0 0 a7e55f31bb45321f30211e901cd89ffa None https://github.com/Michaelwussler 1706010 None Michaelwussler None 0 3 User https://api.github.com/users/Michaelwussler
1 https://secure.gravatar.com/avatar/eb8139078bc... None None None 2012-05-03T18:47:13Z None 0 0 eb8139078bc623dee103ed3917c080dc None https://github.com/praiser 1703505 None praiser None 0 3 User https://api.github.com/users/praiser
2 https://secure.gravatar.com/avatar/13c7b665e0c... None 2010-04-07T12:15:00Z vad.viktor@gmail.com 2 3 13c7b665e0cbd94e0155387c35957d13 False https://github.com/vadviktor 238703 Budapest vadviktor Vad Viktor 0 10 User https://api.github.com/users/vadviktor
3 https://secure.gravatar.com/avatar/b7937805411... None Appcelerator 2012-04-02T16:13:58Z yjin@appcelerator.com 0 0 b7937805411d278ceb839175e251e2a0 False https://github.com/ypjin 1598831 Beijing ypjin Yuping 0 5 User https://api.github.com/users/ypjin
4 https://secure.gravatar.com/avatar/89e109fca84... http://blogs.perl.org/users/steven_haryanto - 2010-02-26T01:28:09Z stevenharyanto@gmail.com 39 307 89e109fca8474e5636c9feef7a8422ea False https://github.com/sharyanto 211084 Jakarta, Indonesia sharyanto Steven Haryanto 5 195 User https://api.github.com/users/sharyanto
5 https://secure.gravatar.com/avatar/7490b4e3e9c... Perl, C, C++, JavaScript, PHP, Haskell, Ruby, ... http://c9s.me 2009-02-01T15:20:08Z cornelius.howl@gmail.com 330 599 7490b4e3e9cb85a1f7dc0c8ea01a86e5 True https://github.com/c9s 50894 Taipei, Taiwan c9s Yo-An Lin 281 206 User https://api.github.com/users/c9s
6 https://secure.gravatar.com/avatar/dc078ac4dbd... None azhari.harahap.us CapungRiders 2010-10-31T05:53:40Z azhari@harahap.us 26 11 dc078ac4dbdc06d3e3c0ec0b6801b53d False https://github.com/back2arie 461397 Indonesia back2arie Azhari Harahap 1 15 User https://api.github.com/users/back2arie
7 https://secure.gravatar.com/avatar/fb844ffed6c... Git Ninja and language-agnostic problem solver... http://dukeleto.pl Leto Labs LLC 2008-10-22T03:02:15Z jonathan@leto.net 175 635 fb844ffed6c5a2e69638627e3b721308 True https://github.com/leto 30298 Portland, OR leto Jonathan \"Duke\" Leto 276 112 User https://api.github.com/users/leto
8 https://secure.gravatar.com/avatar/3843ec7861e... http://alanhaggai.org/ Thought Ripples 2009-01-13T16:25:15Z haggai@cpan.org 46 365 3843ec7861e271e803ea076035d683dd False https://github.com/alanhaggai 46288 IN alanhaggai Alan Haggai Alavi 4 54 User https://api.github.com/users/alanhaggai
9 https://secure.gravatar.com/avatar/f611628c558... None arisdottle.net Team Rooster Pirates 2009-05-12T19:29:09Z amiri@roosterpirates.com 16 87 f611628c5588f7a0a72c65ec1f94dfb8 False https://github.com/amiri 83806 Los Angeles, CA amiri Amiri Barksdale 16 18 User https://api.github.com/users/amiri
10 https://secure.gravatar.com/avatar/c57483c5cfe... None http://www.geekfarm.org/wu/muse/WebHome.html None 2009-02-08T03:28:54Z git-c@geekfarm.org 16 87 c57483c5cfe159b98a6e33ee7e9eec38 False https://github.com/wu 52700 None wu Alex White 0 15 User https://api.github.com/users/wu
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 2, "text": [ " avatar_url \\\n", "0 https://secure.gravatar.com/avatar/a7e55f31bb4... \n", "1 https://secure.gravatar.com/avatar/eb8139078bc... \n", "2 https://secure.gravatar.com/avatar/13c7b665e0c... \n", "3 https://secure.gravatar.com/avatar/b7937805411... \n", "4 https://secure.gravatar.com/avatar/89e109fca84... \n", "5 https://secure.gravatar.com/avatar/7490b4e3e9c... \n", "6 https://secure.gravatar.com/avatar/dc078ac4dbd... \n", "7 https://secure.gravatar.com/avatar/fb844ffed6c... \n", "8 https://secure.gravatar.com/avatar/3843ec7861e... \n", "9 https://secure.gravatar.com/avatar/f611628c558... \n", "10 https://secure.gravatar.com/avatar/c57483c5cfe... \n", "\n", " bio \\\n", "0 None \n", "1 None \n", "2 None \n", "3 \n", "4 \n", "5 Perl, C, C++, JavaScript, PHP, Haskell, Ruby, ... \n", "6 None \n", "7 Git Ninja and language-agnostic problem solver... \n", "8 \n", "9 None \n", "10 None \n", "\n", " blog company \\\n", "0 None None \n", "1 None None \n", "2 \n", "3 None Appcelerator \n", "4 http://blogs.perl.org/users/steven_haryanto - \n", "5 http://c9s.me \n", "6 azhari.harahap.us CapungRiders \n", "7 http://dukeleto.pl Leto Labs LLC \n", "8 http://alanhaggai.org/ Thought Ripples \n", "9 arisdottle.net Team Rooster Pirates \n", "10 http://www.geekfarm.org/wu/muse/WebHome.html None \n", "\n", " created_at email followers following \\\n", "0 2012-05-04T13:59:54Z None 0 0 \n", "1 2012-05-03T18:47:13Z None 0 0 \n", "2 2010-04-07T12:15:00Z vad.viktor@gmail.com 2 3 \n", "3 2012-04-02T16:13:58Z yjin@appcelerator.com 0 0 \n", "4 2010-02-26T01:28:09Z stevenharyanto@gmail.com 39 307 \n", "5 2009-02-01T15:20:08Z cornelius.howl@gmail.com 330 599 \n", "6 2010-10-31T05:53:40Z azhari@harahap.us 26 11 \n", "7 2008-10-22T03:02:15Z jonathan@leto.net 175 635 \n", "8 2009-01-13T16:25:15Z haggai@cpan.org 46 365 \n", "9 2009-05-12T19:29:09Z amiri@roosterpirates.com 16 87 \n", "10 2009-02-08T03:28:54Z git-c@geekfarm.org 
16 87 \n", "\n", " gravatar_id hireable \\\n", "0 a7e55f31bb45321f30211e901cd89ffa None \n", "1 eb8139078bc623dee103ed3917c080dc None \n", "2 13c7b665e0cbd94e0155387c35957d13 False \n", "3 b7937805411d278ceb839175e251e2a0 False \n", "4 89e109fca8474e5636c9feef7a8422ea False \n", "5 7490b4e3e9cb85a1f7dc0c8ea01a86e5 True \n", "6 dc078ac4dbdc06d3e3c0ec0b6801b53d False \n", "7 fb844ffed6c5a2e69638627e3b721308 True \n", "8 3843ec7861e271e803ea076035d683dd False \n", "9 f611628c5588f7a0a72c65ec1f94dfb8 False \n", "10 c57483c5cfe159b98a6e33ee7e9eec38 False \n", "\n", " html_url id location \\\n", "0 https://github.com/Michaelwussler 1706010 None \n", "1 https://github.com/praiser 1703505 None \n", "2 https://github.com/vadviktor 238703 Budapest \n", "3 https://github.com/ypjin 1598831 Beijing \n", "4 https://github.com/sharyanto 211084 Jakarta, Indonesia \n", "5 https://github.com/c9s 50894 Taipei, Taiwan \n", "6 https://github.com/back2arie 461397 Indonesia \n", "7 https://github.com/leto 30298 Portland, OR \n", "8 https://github.com/alanhaggai 46288 IN \n", "9 https://github.com/amiri 83806 Los Angeles, CA \n", "10 https://github.com/wu 52700 None \n", "\n", " login name public_gists public_repos type \\\n", "0 Michaelwussler None 0 3 User \n", "1 praiser None 0 3 User \n", "2 vadviktor Vad Viktor 0 10 User \n", "3 ypjin Yuping 0 5 User \n", "4 sharyanto Steven Haryanto 5 195 User \n", "5 c9s Yo-An Lin 281 206 User \n", "6 back2arie Azhari Harahap 1 15 User \n", "7 leto Jonathan \"Duke\" Leto 276 112 User \n", "8 alanhaggai Alan Haggai Alavi 4 54 User \n", "9 amiri Amiri Barksdale 16 18 User \n", "10 wu Alex White 0 15 User \n", "\n", " url \n", "0 https://api.github.com/users/Michaelwussler \n", "1 https://api.github.com/users/praiser \n", "2 https://api.github.com/users/vadviktor \n", "3 https://api.github.com/users/ypjin \n", "4 https://api.github.com/users/sharyanto \n", "5 https://api.github.com/users/c9s \n", "6 https://api.github.com/users/back2arie \n", "7 
https://api.github.com/users/leto \n", "8 https://api.github.com/users/alanhaggai \n", "9 https://api.github.com/users/amiri \n", "..." ] } ], "prompt_number": 2 }, { "cell_type": "markdown", "metadata": {}, "source": [ "### It feels interactive\n", "\n", "Because by default we ask only for ten elements the remote database can return and communicate results quickly" ] }, { "cell_type": "code", "collapsed": false, "input": [ "users.company" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
company
0 None
1 None
2
3 Appcelerator
4 -
5
6 CapungRiders
7 Leto Labs LLC
8 Thought Ripples
9 Team Rooster Pirates
10 None
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 3, "text": [ " company\n", "0 None\n", "1 None\n", "2 \n", "3 Appcelerator\n", "4 -\n", "5 \n", "6 CapungRiders\n", "7 Leto Labs LLC\n", "8 Thought Ripples\n", "9 Team Rooster Pirates\n", "..." ] } ], "prompt_number": 3 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## It's also powerful\n", "\n", "This computation takes around twenty seconds to run. That's ok. It's querying a terrabyte-scale dataset several thousand miles away. We're ok with twenty seconds." ] }, { "cell_type": "code", "collapsed": false, "input": [ "users[users.followers > 100][['login', 'followers', 'following', 'blog']]" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
loginfollowersfollowingblog
0 c9s 330 599 http://c9s.me
1 leto 175 635 http://dukeleto.pl
2 bingos 125 277 http://use.perl.org/~bingos/journal/
3 chovy 1056 39044 http://anthony.ettinger.name
4 chapmanb 120 30 http://bcbio.wordpress.com
5 equus12 109 4801 None
6 carljm 177 34 http://www.oddbird.net
7 andrewsmedina 171 295 http://www.andrewsmedina.com
8 jbalogh 172 47 http://jbalogh.me
9 ametaireau 116 57 http://www.notmyidea.org
10 robhudson 239 99 http://rob.cogit8.org/
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 4, "text": [ " login followers following blog\n", "0 c9s 330 599 http://c9s.me\n", "1 leto 175 635 http://dukeleto.pl\n", "2 bingos 125 277 http://use.perl.org/~bingos/journal/\n", "3 chovy 1056 39044 http://anthony.ettinger.name\n", "4 chapmanb 120 30 http://bcbio.wordpress.com\n", "5 equus12 109 4801 None\n", "6 carljm 177 34 http://www.oddbird.net\n", "7 andrewsmedina 171 295 http://www.andrewsmedina.com\n", "8 jbalogh 172 47 http://jbalogh.me\n", "9 ametaireau 116 57 http://www.notmyidea.org\n", "..." ] } ], "prompt_number": 4 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More tables" ] }, { "cell_type": "code", "collapsed": false, "input": [ "repos = Table('mongodb://ghtorrentro:ghtorrentro@localhost/github::repos')\n", "repos" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", 
" \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
clone_urlcreated_atdescriptionforkforksfull_namegit_urlhas_downloadshas_issueshas_wikihomepagehtml_urlidlanguagemaster_branchmirror_urlnameopen_issuesorganizationownerparentprivatepushed_atsizesourcessh_urlsvn_urlupdated_aturlwatchers
0 https://github.com/Michaelwussler/gittest.git 2012-07-12T10:41:03Z False 1 Michaelwussler/gittest git://github.com/Michaelwussler/gittest.git True True True None https://github.com/Michaelwussler/gittest 5002137 Java master None gittest 0 None {u'url': u'https://api.github.com/users/Michae... None False 2012-07-12T11:40:07Z 164 None git@github.com:Michaelwussler/gittest.git https://github.com/Michaelwussler/gittest 2012-07-12T11:40:07Z https://api.github.com/repos/Michaelwussler/gi... 1
1 https://github.com/sharyanto/perl-Task-BeLike-... 2011-03-16T15:06:38Z Install modules currently used in SHARYANTO's ... False 1 sharyanto/perl-Task-BeLike-SHARYANTO-Devel git://github.com/sharyanto/perl-Task-BeLike-SH... True True True http://search.cpan.org/dist/Task-BeLike-SHARYA... https://github.com/sharyanto/perl-Task-BeLike-... 1487560 Perl master None perl-Task-BeLike-SHARYANTO-Devel 0 None {u'url': u'https://api.github.com/users/sharya... None False 2012-07-12T11:35:03Z 608 None git@github.com:sharyanto/perl-Task-BeLike-SHAR... https://github.com/sharyanto/perl-Task-BeLike-... 2012-07-12T11:35:03Z https://api.github.com/repos/sharyanto/perl-Ta... 1
2 https://github.com/Toolpark/irma.git 2012-03-20T11:31:16Z False 1 Toolpark/irma git://github.com/Toolpark/irma.git True True True https://github.com/Toolpark/irma 3774477 JavaScript master None irma 0 {u'url': u'https://api.github.com/users/Toolpa... {u'url': u'https://api.github.com/users/Toolpa... None False 2012-07-12T11:43:31Z 964 None git@github.com:Toolpark/irma.git https://github.com/Toolpark/irma 2012-07-12T11:43:31Z https://api.github.com/repos/Toolpark/irma 2
3 https://github.com/hirakchatterjee/try_git.git 2012-07-12T11:19:45Z None False 1 hirakchatterjee/try_git git://github.com/hirakchatterjee/try_git.git True True True None https://github.com/hirakchatterjee/try_git 5002444 None master None try_git 0 None {u'url': u'https://api.github.com/users/hirakc... None False 2012-07-12T11:31:50Z 92 None git@github.com:hirakchatterjee/try_git.git https://github.com/hirakchatterjee/try_git 2012-07-12T11:31:50Z https://api.github.com/repos/hirakchatterjee/t... 1
4 https://github.com/anirbansaha/inmobi_general_... 2012-07-10T05:37:49Z inmobi_general_cookbooks False 1 anirbansaha/inmobi_general_cookbooks git://github.com/anirbansaha/inmobi_general_co... True True True None https://github.com/anirbansaha/inmobi_general_... 4969515 Ruby master None inmobi_general_cookbooks 0 None {u'url': u'https://api.github.com/users/anirba... None False 2012-07-12T11:31:44Z 448 None git@github.com:anirbansaha/inmobi_general_cook... https://github.com/anirbansaha/inmobi_general_... 2012-07-12T11:31:44Z https://api.github.com/repos/anirbansaha/inmob... 1
5 https://github.com/mmacedo/myapp.git 2012-07-05T21:09:14Z Just test False 1 mmacedo/myapp git://github.com/mmacedo/myapp.git True False False None https://github.com/mmacedo/myapp 4915307 Ruby master None myapp 0 None {u'url': u'https://api.github.com/users/mmaced... None False 2012-07-12T11:35:33Z 356 None git@github.com:mmacedo/myapp.git https://github.com/mmacedo/myapp 2012-07-12T11:35:33Z https://api.github.com/repos/mmacedo/myapp 1
6 https://github.com/rotschopf/SSE.git 2012-05-18T11:38:07Z False 1 rotschopf/SSE git://github.com/rotschopf/SSE.git True False False None https://github.com/rotschopf/SSE 4368710 VHDL master None SSE 0 None {u'url': u'https://api.github.com/users/rotsch... None False 2012-07-12T11:30:39Z 944 None git@github.com:rotschopf/SSE.git https://github.com/rotschopf/SSE 2012-07-12T11:30:39Z https://api.github.com/repos/rotschopf/SSE 1
7 https://github.com/pokermania/engine.ns.io-cli... 2012-07-05T15:59:51Z True 0 pokermania/engine.ns.io-client git://github.com/pokermania/engine.ns.io-clien... True False True https://github.com/pokermania/engine.ns.io-client 4910102 CoffeeScript master None engine.ns.io-client 0 {u'url': u'https://api.github.com/users/pokerm... {u'url': u'https://api.github.com/users/pokerm... {u'has_wiki': True, u'mirror_url': None, u'upd... False 2012-07-12T11:31:40Z 112 {u'has_wiki': True, u'mirror_url': None, u'upd... git@github.com:pokermania/engine.ns.io-client.git https://github.com/pokermania/engine.ns.io-client 2012-07-12T11:31:41Z https://api.github.com/repos/pokermania/engine... 1
8 https://github.com/trifork/dgws.git 2012-04-12T11:04:29Z False 3 trifork/dgws git://github.com/trifork/dgws.git True True True https://github.com/trifork/dgws 4003806 Java develop None dgws 0 {u'url': u'https://api.github.com/users/trifor... {u'url': u'https://api.github.com/users/trifor... None False 2012-07-12T11:40:57Z 168 None git@github.com:trifork/dgws.git https://github.com/trifork/dgws 2012-07-12T11:40:57Z https://api.github.com/repos/trifork/dgws 4
9 https://github.com/fzoli/MillServer.git 2012-06-27T07:01:42Z False 1 fzoli/MillServer git://github.com/fzoli/MillServer.git True True True None https://github.com/fzoli/MillServer 4805282 Java master None MillServer 0 None {u'url': u'https://api.github.com/users/fzoli'... None False 2012-07-12T11:31:32Z 75760 None git@github.com:fzoli/MillServer.git https://github.com/fzoli/MillServer 2012-07-12T11:31:32Z https://api.github.com/repos/fzoli/MillServer 1
10 https://github.com/gkno/gkno.github.com.git 2012-02-23T21:46:20Z False 2 gkno/gkno.github.com git://github.com/gkno/gkno.github.com.git True True True gkno.github.com https://github.com/gkno/gkno.github.com 3530198 None master None gkno.github.com 1 {u'url': u'https://api.github.com/users/gkno',... {u'url': u'https://api.github.com/users/gkno',... None False 2012-07-12T11:31:33Z 160 None git@github.com:gkno/gkno.github.com.git https://github.com/gkno/gkno.github.com 2012-07-12T11:31:33Z https://api.github.com/repos/gkno/gkno.github.com 2
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 5, "text": [ " clone_url created_at \\\n", "0 https://github.com/Michaelwussler/gittest.git 2012-07-12T10:41:03Z \n", "1 https://github.com/sharyanto/perl-Task-BeLike-... 2011-03-16T15:06:38Z \n", "2 https://github.com/Toolpark/irma.git 2012-03-20T11:31:16Z \n", "3 https://github.com/hirakchatterjee/try_git.git 2012-07-12T11:19:45Z \n", "4 https://github.com/anirbansaha/inmobi_general_... 2012-07-10T05:37:49Z \n", "5 https://github.com/mmacedo/myapp.git 2012-07-05T21:09:14Z \n", "6 https://github.com/rotschopf/SSE.git 2012-05-18T11:38:07Z \n", "7 https://github.com/pokermania/engine.ns.io-cli... 2012-07-05T15:59:51Z \n", "8 https://github.com/trifork/dgws.git 2012-04-12T11:04:29Z \n", "9 https://github.com/fzoli/MillServer.git 2012-06-27T07:01:42Z \n", "10 https://github.com/gkno/gkno.github.com.git 2012-02-23T21:46:20Z \n", "\n", " description fork forks \\\n", "0 False 1 \n", "1 Install modules currently used in SHARYANTO's ... False 1 \n", "2 False 1 \n", "3 None False 1 \n", "4 inmobi_general_cookbooks False 1 \n", "5 Just test False 1 \n", "6 False 1 \n", "7 True 0 \n", "8 False 3 \n", "9 False 1 \n", "10 False 2 \n", "\n", " full_name \\\n", "0 Michaelwussler/gittest \n", "1 sharyanto/perl-Task-BeLike-SHARYANTO-Devel \n", "2 Toolpark/irma \n", "3 hirakchatterjee/try_git \n", "4 anirbansaha/inmobi_general_cookbooks \n", "5 mmacedo/myapp \n", "6 rotschopf/SSE \n", "7 pokermania/engine.ns.io-client \n", "8 trifork/dgws \n", "9 fzoli/MillServer \n", "10 gkno/gkno.github.com \n", "\n", " git_url has_downloads \\\n", "0 git://github.com/Michaelwussler/gittest.git True \n", "1 git://github.com/sharyanto/perl-Task-BeLike-SH... True \n", "2 git://github.com/Toolpark/irma.git True \n", "3 git://github.com/hirakchatterjee/try_git.git True \n", "4 git://github.com/anirbansaha/inmobi_general_co... 
True \n", "5 git://github.com/mmacedo/myapp.git True \n", "6 git://github.com/rotschopf/SSE.git True \n", "7 git://github.com/pokermania/engine.ns.io-clien... True \n", "8 git://github.com/trifork/dgws.git True \n", "9 git://github.com/fzoli/MillServer.git True \n", "10 git://github.com/gkno/gkno.github.com.git True \n", "\n", " has_issues has_wiki ... \\\n", "0 True True ... \n", "1 True True ... \n", "2 True True ... \n", "3 True True ... \n", "4 True True ... \n", "5 False False ... \n", "6 False False ... \n", "7 False True ... \n", "8 True True ... \n", "9 True True ... \n", "10 True True ... \n", "\n", " parent private \\\n", "0 None False \n", "1 None False \n", "2 None False \n", "3 None False \n", "4 None False \n", "5 None False \n", "6 None False \n", "7 {u'has_wiki': True, u'mirror_url': None, u'upd... False \n", "8 None False \n", "9 None False \n", "10 None False \n", "\n", " pushed_at size \\\n", "0 2012-07-12T11:40:07Z 164 \n", "1 2012-07-12T11:35:03Z 608 \n", "2 2012-07-12T11:43:31Z 964 \n", "3 2012-07-12T11:31:50Z 92 \n", "4 2012-07-12T11:31:44Z 448 \n", "5 2012-07-12T11:35:33Z 356 \n", "6 2012-07-12T11:30:39Z 944 \n", "7 2012-07-12T11:31:40Z 112 \n", "8 2012-07-12T11:40:57Z 168 \n", "9 2012-07-12T11:31:32Z 75760 \n", "10 2012-07-12T11:31:33Z 160 \n", "\n", " source \\\n", "0 None \n", "1 None \n", "2 None \n", "3 None \n", "4 None \n", "5 None \n", "6 None \n", "7 {u'has_wiki': True, u'mirror_url': None, u'upd... \n", "8 None \n", "9 None \n", "10 None \n", "\n", " ssh_url \\\n", "0 git@github.com:Michaelwussler/gittest.git \n", "1 git@github.com:sharyanto/perl-Task-BeLike-SHAR... \n", "2 git@github.com:Toolpark/irma.git \n", "3 git@github.com:hirakchatterjee/try_git.git \n", "4 git@github.com:anirbansaha/inmobi_general_cook... 
\n", "5 git@github.com:mmacedo/myapp.git \n", "6 git@github.com:rotschopf/SSE.git \n", "7 git@github.com:pokermania/engine.ns.io-client.git \n", "8 git@github.com:trifork/dgws.git \n", "9 git@github.com:fzoli/MillServer.git \n", "10 git@github.com:gkno/gkno.github.com.git \n", "\n", " svn_url updated_at \\\n", "0 https://github.com/Michaelwussler/gittest 2012-07-12T11:40:07Z \n", "1 https://github.com/sharyanto/perl-Task-BeLike-... 2012-07-12T11:35:03Z \n", "2 https://github.com/Toolpark/irma 2012-07-12T11:43:31Z \n", "3 https://github.com/hirakchatterjee/try_git 2012-07-12T11:31:50Z \n", "4 https://github.com/anirbansaha/inmobi_general_... 2012-07-12T11:31:44Z \n", "5 https://github.com/mmacedo/myapp 2012-07-12T11:35:33Z \n", "6 https://github.com/rotschopf/SSE 2012-07-12T11:30:39Z \n", "7 https://github.com/pokermania/engine.ns.io-client 2012-07-12T11:31:41Z \n", "8 https://github.com/trifork/dgws 2012-07-12T11:40:57Z \n", "9 https://github.com/fzoli/MillServer 2012-07-12T11:31:32Z \n", "10 https://github.com/gkno/gkno.github.com 2012-07-12T11:31:33Z \n", "\n", " url watchers \n", "0 https://api.github.com/repos/Michaelwussler/gi... 1 \n", "1 https://api.github.com/repos/sharyanto/perl-Ta... 1 \n", "2 https://api.github.com/repos/Toolpark/irma 2 \n", "3 https://api.github.com/repos/hirakchatterjee/t... 1 \n", "4 https://api.github.com/repos/anirbansaha/inmob... 1 \n", "5 https://api.github.com/repos/mmacedo/myapp 1 \n", "6 https://api.github.com/repos/rotschopf/SSE 1 \n", "7 https://api.github.com/repos/pokermania/engine... 1 \n", "8 https://api.github.com/repos/trifork/dgws 4 \n", "9 https://api.github.com/repos/fzoli/MillServer 1 \n", "10 https://api.github.com/repos/gkno/gkno.github.com 2 \n", "\n", "..." 
] } ], "prompt_number": 5 }, { "cell_type": "code", "collapsed": false, "input": [ "issues = Table('mongodb://ghtorrentro:ghtorrentro@localhost/github::issues')\n", "issues" ], "language": "python", "metadata": {}, "outputs": [ { "html": [ "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
assigneebodyclosed_atcommentscomments_urlcreated_atevents_urlhtml_urlidlabelslabels_urlmilestonenumberownerpull_requestrepostatetitleupdated_aturluser
0 None TweetLine is a Sublime Text 2 Plugin to post c... None 0 https://api.github.com/repos/wbond/package_con... 2012-11-20T15:51:49Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8509346 [] https://api.github.com/repos/wbond/package_con... None 809 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Add SublimeTweetLine 2012-11-20T15:52:20Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
1 None Submitting a new package named AutoIndent whic... None 0 https://api.github.com/repos/wbond/package_con... 2012-11-20T08:16:05Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8496155 [] https://api.github.com/repos/wbond/package_con... None 808 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Added AutoIndent 2012-11-20T08:16:05Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
2 None Adding support for my library of Sublime Text ... None 8 https://api.github.com/repos/wbond/package_con... 2012-11-19T22:47:17Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8485997 [] https://api.github.com/repos/wbond/package_con... None 806 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Adding Dayle Rees Color Schemes 2012-11-20T06:17:02Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
3 None Added SuperAnt 2012-10-02T02:34:26Z 0 None 2012-09-28T19:32:40Z None https://github.com/wbond/package_control_chann... 7226975 [] None None 657 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel closed SuperANT 2012-10-02T02:34:26Z https://api.github.com/repos/wbond/package_con... {u'url': u'https://api.github.com/users/aphex'...
4 None See readme for info! None 0 https://api.github.com/repos/wbond/package_con... 2012-11-19T19:27:28Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8479860 [] https://api.github.com/repos/wbond/package_con... None 805 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Adding Expand Selection by Paragraph Plugin 2012-11-19T19:27:28Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
5 None Added JavaScript snippets from: https://github... None 0 https://api.github.com/repos/wbond/package_con... 2012-11-19T19:23:11Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8479724 [] https://api.github.com/repos/wbond/package_con... None 804 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Added JavaScript Snippets 2012-11-19T19:23:11Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
6 None See [repository on GitHub](https://github.com/... None 0 https://api.github.com/repos/wbond/package_con... 2012-11-19T18:49:52Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8478712 [] https://api.github.com/repos/wbond/package_con... None 803 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Added ParentalControl Package 2012-11-19T18:49:52Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
7 None IMESupport is a plugin to fix an issue of Subl... None 0 https://api.github.com/repos/wbond/package_con... 2012-11-19T15:50:14Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8472990 [] https://api.github.com/repos/wbond/package_con... None 802 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Add IMESupport plugin 2012-11-19T15:50:14Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
8 None ThemeSelector is a Sublime Text 2 Plugin to se... None 0 https://api.github.com/repos/wbond/package_con... 2012-11-19T14:20:34Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8470181 [] https://api.github.com/repos/wbond/package_con... None 801 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open add ThemeSelector 2012-11-19T14:20:34Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
9 None None 0 https://api.github.com/repos/wbond/package_con... 2012-11-19T12:31:38Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8467611 [] https://api.github.com/repos/wbond/package_con... None 800 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Added Gauche Syntax 2012-11-19T12:31:38Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
10 None None 0 https://api.github.com/repos/wbond/package_con... 2012-11-19T06:40:47Z https://api.github.com/repos/wbond/package_con... https://github.com/wbond/package_control_chann... 8460421 [] https://api.github.com/repos/wbond/package_con... None 799 wbond {u'diff_url': u'https://github.com/wbond/packa... package_control_channel open Adding LettuceFarmer plugin 2012-11-19T06:40:47Z https://api.github.com/repos/wbond/package_con... {u'following_url': u'https://api.github.com/us...
" ], "metadata": {}, "output_type": "pyout", "prompt_number": 6, "text": [ " assignee body \\\n", "0 None TweetLine is a Sublime Text 2 Plugin to post c... \n", "1 None Submitting a new package named AutoIndent whic... \n", "2 None Adding support for my library of Sublime Text ... \n", "3 None Added SuperAnt \n", "4 None See readme for info! \n", "5 None Added JavaScript snippets from: https://github... \n", "6 None See [repository on GitHub](https://github.com/... \n", "7 None IMESupport is a plugin to fix an issue of Subl... \n", "8 None ThemeSelector is a Sublime Text 2 Plugin to se... \n", "9 None \n", "10 None \n", "\n", " closed_at comments \\\n", "0 None 0 \n", "1 None 0 \n", "2 None 8 \n", "3 2012-10-02T02:34:26Z 0 \n", "4 None 0 \n", "5 None 0 \n", "6 None 0 \n", "7 None 0 \n", "8 None 0 \n", "9 None 0 \n", "10 None 0 \n", "\n", " comments_url created_at \\\n", "0 https://api.github.com/repos/wbond/package_con... 2012-11-20T15:51:49Z \n", "1 https://api.github.com/repos/wbond/package_con... 2012-11-20T08:16:05Z \n", "2 https://api.github.com/repos/wbond/package_con... 2012-11-19T22:47:17Z \n", "3 None 2012-09-28T19:32:40Z \n", "4 https://api.github.com/repos/wbond/package_con... 2012-11-19T19:27:28Z \n", "5 https://api.github.com/repos/wbond/package_con... 2012-11-19T19:23:11Z \n", "6 https://api.github.com/repos/wbond/package_con... 2012-11-19T18:49:52Z \n", "7 https://api.github.com/repos/wbond/package_con... 2012-11-19T15:50:14Z \n", "8 https://api.github.com/repos/wbond/package_con... 2012-11-19T14:20:34Z \n", "9 https://api.github.com/repos/wbond/package_con... 2012-11-19T12:31:38Z \n", "10 https://api.github.com/repos/wbond/package_con... 2012-11-19T06:40:47Z \n", "\n", " events_url \\\n", "0 https://api.github.com/repos/wbond/package_con... \n", "1 https://api.github.com/repos/wbond/package_con... \n", "2 https://api.github.com/repos/wbond/package_con... \n", "3 None \n", "4 https://api.github.com/repos/wbond/package_con... 
\n", "5 https://api.github.com/repos/wbond/package_con... \n", "6 https://api.github.com/repos/wbond/package_con... \n", "7 https://api.github.com/repos/wbond/package_con... \n", "8 https://api.github.com/repos/wbond/package_con... \n", "9 https://api.github.com/repos/wbond/package_con... \n", "10 https://api.github.com/repos/wbond/package_con... \n", "\n", " html_url id labels ... \\\n", "0 https://github.com/wbond/package_control_chann... 8509346 [] ... \n", "1 https://github.com/wbond/package_control_chann... 8496155 [] ... \n", "2 https://github.com/wbond/package_control_chann... 8485997 [] ... \n", "3 https://github.com/wbond/package_control_chann... 7226975 [] ... \n", "4 https://github.com/wbond/package_control_chann... 8479860 [] ... \n", "5 https://github.com/wbond/package_control_chann... 8479724 [] ... \n", "6 https://github.com/wbond/package_control_chann... 8478712 [] ... \n", "7 https://github.com/wbond/package_control_chann... 8472990 [] ... \n", "8 https://github.com/wbond/package_control_chann... 8470181 [] ... \n", "9 https://github.com/wbond/package_control_chann... 8467611 [] ... \n", "10 https://github.com/wbond/package_control_chann... 8460421 [] ... \n", "\n", " milestone number owner pull_request \\\n", "0 None 809 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "1 None 808 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "2 None 806 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "3 None 657 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "4 None 805 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "5 None 804 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "6 None 803 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "7 None 802 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "8 None 801 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "9 None 800 wbond {u'diff_url': u'https://github.com/wbond/packa... 
\n", "10 None 799 wbond {u'diff_url': u'https://github.com/wbond/packa... \n", "\n", " repo state \\\n", "0 package_control_channel open \n", "1 package_control_channel open \n", "2 package_control_channel open \n", "3 package_control_channel closed \n", "4 package_control_channel open \n", "5 package_control_channel open \n", "6 package_control_channel open \n", "7 package_control_channel open \n", "8 package_control_channel open \n", "9 package_control_channel open \n", "10 package_control_channel open \n", "\n", " title updated_at \\\n", "0 Add SublimeTweetLine 2012-11-20T15:52:20Z \n", "1 Added AutoIndent 2012-11-20T08:16:05Z \n", "2 Adding Dayle Rees Color Schemes 2012-11-20T06:17:02Z \n", "3 SuperANT 2012-10-02T02:34:26Z \n", "4 Adding Expand Selection by Paragraph Plugin 2012-11-19T19:27:28Z \n", "5 Added JavaScript Snippets 2012-11-19T19:23:11Z \n", "6 Added ParentalControl Package 2012-11-19T18:49:52Z \n", "7 Add IMESupport plugin 2012-11-19T15:50:14Z \n", "8 add ThemeSelector 2012-11-19T14:20:34Z \n", "9 Added Gauche Syntax 2012-11-19T12:31:38Z \n", "10 Adding LettuceFarmer plugin 2012-11-19T06:40:47Z \n", "\n", " url \\\n", "0 https://api.github.com/repos/wbond/package_con... \n", "1 https://api.github.com/repos/wbond/package_con... \n", "2 https://api.github.com/repos/wbond/package_con... \n", "3 https://api.github.com/repos/wbond/package_con... \n", "4 https://api.github.com/repos/wbond/package_con... \n", "5 https://api.github.com/repos/wbond/package_con... \n", "6 https://api.github.com/repos/wbond/package_con... \n", "7 https://api.github.com/repos/wbond/package_con... \n", "8 https://api.github.com/repos/wbond/package_con... \n", "9 https://api.github.com/repos/wbond/package_con... \n", "10 https://api.github.com/repos/wbond/package_con... \n", "\n", " user \n", "0 {u'following_url': u'https://api.github.com/us... \n", "1 {u'following_url': u'https://api.github.com/us... \n", "2 {u'following_url': u'https://api.github.com/us... 
\n", "3 {u'url': u'https://api.github.com/users/aphex'... \n", "4 {u'following_url': u'https://api.github.com/us... \n", "5 {u'following_url': u'https://api.github.com/us... \n", "6 {u'following_url': u'https://api.github.com/us... \n", "7 {u'following_url': u'https://api.github.com/us... \n", "8 {u'following_url': u'https://api.github.com/us... \n", "9 {u'following_url': u'https://api.github.com/us... \n", "10 {u'following_url': u'https://api.github.com/us... \n", "\n", "..." ] } ], "prompt_number": 6 }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Use `into` to bring results home\n", "\n", "What are the open Blaze issues?" ] }, { "cell_type": "code", "collapsed": false, "input": [ "into(list,\n", " issues[(issues.owner == 'ContinuumIO') \n", " & (issues.repo == 'blaze')\n", " & (issues.state == 'open')][['title', 'created_at']])" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 7, "text": [ "[(u\"Blaze docs can't be read on iPhone\", u'2013-06-24T21:33:56Z'),\n", " (u\"Tiny fix about options parsing in the 'chunked dot' bench.\",\n", " u'2013-06-04T22:23:51Z'),\n", " (u'Declaring dependencies', u'2013-04-19T05:59:02Z'),\n", " (u'add a basic mailmap file', u'2013-04-12T18:57:20Z'),\n", " (u'After following install instructions \"blaze\" module won\\'t import',\n", " u'2013-03-22T22:13:22Z'),\n", " (u'Mailing list link on http://blaze.pydata.org/ points to GitHub not Google Groups',\n", " u'2013-03-22T22:05:51Z'),\n", " (u\"Quickstart first example doesn't work as described\",\n", " u'2013-03-21T01:24:53Z'),\n", " (u'Disagreement in size of \"float\" between blaze and numpy',\n", " u'2013-03-16T17:37:33Z'),\n", " (u'blaze.zeros() slowness', u'2013-03-13T16:06:19Z'),\n", " (u'Opening CTable fails', u'2013-03-03T22:22:14Z'),\n", " (u'fromiter silently catches exceptions thrown by generators, generates bad matrices',\n", " u'2013-03-01T17:36:34Z'),\n", " (u'Add complex32 support', 
u'2013-02-28T18:34:24Z'),\n", " (u'persistence of tables seems to not be working', u'2013-02-27T18:03:07Z'),\n", " (u'warnings when building extensions (at least on mac os x)',\n", " u'2013-02-27T16:47:24Z'),\n", " (u'Example in quick start docs does not work', u'2013-02-21T11:06:06Z'),\n", " (u'Parsing datashapes with \"type Name = ...\" in them returns None',\n", " u'2013-02-19T19:44:40Z'),\n", " (u'Vlen implementation issues on Windows', u'2013-02-14T11:59:17Z'),\n", " (u\"Can't create Tables using RecordDecl per the examples in the docs\",\n", " u'2013-02-01T04:17:56Z'),\n", " (u'many import errors', u'2013-07-18T02:55:58Z'),\n", " (u\" BLZ `format` '' is not supported.\", u'2013-07-31T15:21:21Z'),\n", " (u'Start on expressoin graph', u'2013-08-23T15:01:07Z'),\n", " (u\"Can't create large multidimensional array in BLZ\",\n", " u'2013-09-09T08:38:34Z'),\n", " (u'Blaze Kernels', u'2013-09-06T13:58:04Z'),\n", " (u'Iterators in BLZ should be in their own class', u'2013-09-25T09:35:43Z'),\n", " (u'Missing dynd-python dependency requirement', u'2013-10-03T21:00:37Z'),\n", " (u'outdated website examples?', u'2013-11-04T02:50:19Z'),\n", " (u'Update install doc about the dynd dependency. Closed #79',\n", " u'2013-10-29T20:52:45Z'),\n", " (u'blz storage r/w mode is wrong', u'2013-11-04T02:54:42Z'),\n", " (u'The printing code should support general datashapes',\n", " u'2013-11-28T15:35:01Z'),\n", " (u'Added data descriptors for CSV and JSON files. 
Storage and Array also support them.',\n", " u'2013-12-10T12:33:08Z'),\n", " (u'a basic catalog', u'2013-12-10T08:06:03Z'),\n", " (u'Documentation', u'2013-12-09T16:21:42Z'),\n", " (u'Open command should be load', u'2013-12-04T20:36:58Z'),\n", " (u\"Array's from iterators don't determing type correctly\",\n", " u'2013-12-03T20:38:29Z'),\n", " (u'Cannot create a blaze array with a Record datashape',\n", " u'2013-12-02T14:03:39Z'),\n", " (u'[WIP] Blaze distributed capabilities', u'2013-12-12T16:43:22Z'),\n", " (u'Shuffle files around', u'2013-12-12T05:44:43Z'),\n", " (u'Do not require uri for local file', u'2013-12-11T22:46:15Z'),\n", " (u'Second round of shuffling', u'2013-12-12T19:25:59Z'),\n", " (u'[WIP] Csv dd cleanup refactor', u'2013-12-18T23:27:39Z'),\n", " (u'Update server code to use catalog', u'2013-12-18T08:02:41Z'),\n", " (u'Data Descriptor cleanup', u'2013-12-17T18:06:25Z'),\n", " (u'Dshape refactor', u'2013-12-19T23:58:34Z'),\n", " (u\"use relative imports for tests and blz_ext's use of bparams\",\n", " u'2013-12-21T22:24:48Z'),\n", " (u'Skipping cffi test on travis', u'2014-01-01T22:11:52Z'),\n", " (u'Convert remote array to a data descriptor', u'2014-01-08T01:05:46Z'),\n", " (u'[WIP] Remove blz', u'2014-01-13T14:43:52Z'),\n", " (u'Consider renaming blaze.drop function', u'2014-01-14T06:19:37Z'),\n", " (u'blaze.array from iterator type deduction, closes issue #86',\n", " u'2014-01-14T02:14:14Z'),\n", " (u'Clean up blaze.array methods/attributes', u'2014-01-14T06:27:11Z'),\n", " (u'Server compute context', u'2014-01-14T22:06:38Z'),\n", " (u'[WIP] Doc tweaks', u'2014-01-14T06:32:32Z'),\n", " (u'Add a caching mechanism to the blaze catalog', u'2014-01-16T23:02:26Z'),\n", " (u'Reform execution pipeline a bit (work towards better integration of new backends',\n", " u'2014-01-20T16:09:14Z'),\n", " (u'[WIP] HDF5 DataDescriptor', u'2014-01-20T12:19:01Z'),\n", " (u'Catalog module requires yaml', u'2014-01-21T10:06:59Z'),\n", " (u'design doc for numpy-like API', 
u'2014-01-21T08:06:18Z'),\n", " (u'Uniformexecution', u'2014-01-21T20:26:50Z'),\n", " (u'DataDescriptor for HDF5', u'2014-01-22T17:23:53Z'),\n", " (u'Diagnose and fix problem evaluating nested ufunc calls',\n", " u'2014-01-29T20:44:58Z'),\n", " (u'[WIP] Sql', u'2014-01-29T17:35:37Z'),\n", " (u'Adding internal import details', u'2014-01-30T20:01:48Z'),\n", " (u'Add drop to catalog', u'2014-01-31T17:06:36Z'),\n", " (u'[WIP] HDF5 DataDescriptor docs and examples', u'2014-02-03T16:30:06Z'),\n", " (u'Work on allowing creation of stand-alone blaze functions',\n", " u'2014-02-03T14:48:04Z'),\n", " (u'Backend generalization and gentle start on sql backend',\n", " u'2014-02-03T14:47:08Z'),\n", " (u'Syntax for choosing multiple fields', u'2014-02-05T20:21:37Z'),\n", " (u'Work on AIR debug printing in blaze REPL', u'2014-02-05T17:08:04Z'),\n", " (u'`blaze.open()` should try to recognize file extensions in case `format` param is not passed',\n", " u'2014-02-04T16:52:22Z'),\n", " (u'Pyinterp', u'2014-02-12T11:39:15Z'),\n", " (u'Strategy', u'2014-02-11T18:51:20Z'),\n", " (u'[WIP] Work in sql(ite) data descriptor', u'2014-02-11T15:04:35Z'),\n", " (u'[WIP] Initial design for making hdf5 files acting as native catalog dirs',\n", " u'2014-02-06T17:18:48Z'),\n", " (u'Assignation error between blaze array and numpy array',\n", " u'2014-02-17T14:56:44Z'),\n", " (u'Update README.md to fix a broken link', u'2014-02-14T22:09:30Z'),\n", " (u'hdf5 sample broken', u'2014-02-12T16:54:09Z'),\n", " (u'Cannot get string out of blaze array', u'2014-02-18T18:42:51Z'),\n", " (u'Iteration over blaze arrays returns data descriptors',\n", " u'2014-02-18T18:42:02Z'),\n", " (u'Manual CSV delimiter specification needed', u'2014-02-18T17:18:39Z'),\n", " (u'Dependency on pyparsing', u'2014-02-19T18:19:54Z'),\n", " (u'Adding testing driven code documentation', u'2014-02-19T17:55:22Z'),\n", " (u'Blaze sql record field selection is not lazy', u'2014-02-26T17:47:07Z'),\n", " (u'SQL tutorial and column 
selection', u'2014-02-26T11:01:45Z'),\n", " (u'Sqldocs', u'2014-02-24T17:21:48Z'),\n", " (u'[WIP] Removing some \"import blaze\" statements', u'2014-02-20T17:24:04Z'),\n", " (u'Graphs do not fold constants', u'2014-03-03T22:47:26Z'),\n", " (u'Simple graph', u'2014-03-03T22:27:37Z'),\n", " (u'A proposal for a simple SQL cache for Blaze.', u'2014-03-03T18:11:58Z'),\n", " (u'[WIP] Updates for new datashape grammar', u'2014-02-27T03:49:07Z'),\n", " (u\"blaze server can't handle names with .\", u'2014-03-03T22:53:01Z'),\n", " (u'[WIP] Constant folding', u'2014-03-03T22:48:09Z'),\n", " (u'blaze catalog requires hdf5 file have extension .h5',\n", " u'2014-03-03T22:53:54Z'),\n", " (u'Add hdf5 catalog to server sample', u'2014-03-03T23:17:53Z'),\n", " (u\"Samples and doctest aren't tested with unittests\",\n", " u'2014-03-04T00:13:39Z'),\n", " (u\"Samples and doctest aren't tested with unittests\",\n", " u'2014-03-04T00:26:47Z'),\n", " (u'[WIP] Fix sql printing and selection', u'2014-03-04T00:24:18Z'),\n", " (u'[WIP] Use new datashape overloader, general dispatch cleanup',\n", " u'2014-03-14T01:07:11Z'),\n", " (u'Tweak for overloader PR on datashape', u'2014-03-13T21:26:50Z'),\n", " (u'[WIP] Design Doc Update', u'2014-03-07T08:46:24Z'),\n", " (u'Mode in storage is respected in constructors now. 
Fixes #83.',\n", " u'2014-03-14T21:03:25Z'),\n", " (u'Indexed assignment does not work', u'2014-03-18T17:37:58Z'),\n", " (u'[WIP] Element wise, chunked evaluator, suited for OOC operations',\n", " u'2014-03-17T15:22:22Z'),\n", " (u'Better error message on getting buffers out of deferred arrays',\n", " u'2014-03-19T12:57:18Z'),\n", " (u'Remove scidb (for now), make default overloading explicit',\n", " u'2014-03-19T00:12:57Z'),\n", " (u'Continuing proposal for a SQL cache for Blaze.', u'2014-03-19T19:59:13Z'),\n", " (u'Build dynd in travis', u'2014-03-20T01:39:59Z'),\n", " (u'Adding dynd install from source', u'2014-03-19T23:19:30Z'),\n", " (u'Blaze SQL Example Fails', u'2014-03-20T14:38:12Z'),\n", " (u'Update requirements use only pip on travis', u'2014-03-20T20:42:43Z'),\n", " (u'SQL catalogue parsing', u'2014-03-20T16:04:32Z'),\n", " (u'[WIP] Add ReductionBlazeFunc and instances using it',\n", " u'2014-03-20T23:50:13Z'),\n", " (u'Assignments of operations in ranges does not work',\n", " u'2014-03-21T10:40:34Z'),\n", " (u'A propsoal for handling SQL queries', u'2014-03-21T04:54:48Z'),\n", " (u'[WIP] A first proposal for a Table object', u'2014-03-22T08:48:24Z'),\n", " (u'[WIP] A first proposal for a Table object', u'2014-03-21T16:27:37Z'),\n", " (u'Finish reduction support', u'2014-03-25T21:15:57Z'),\n", " (u'Adding support for the HDF5 format in Storage class',\n", " u'2014-03-26T16:04:22Z'),\n", " (u'[WIP] datetime design doc', u'2014-03-26T08:13:01Z'),\n", " (u'A design document to convert the DataDescriptor class as first-class citizen',\n", " u'2014-03-27T16:25:28Z'),\n", " (u'Link datashape doc to datashape repo', u'2014-03-28T17:38:40Z'),\n", " (u'[WIP] High level parallel expression graph', u'2014-03-28T14:36:27Z'),\n", " (u'[WIP] datetime implementation', u'2014-03-28T22:48:39Z'),\n", " (u'[WIP] A blaze.where() function for filters for HDF5 and BLZ',\n", " u'2014-04-03T15:10:20Z'),\n", " (u'Array.__iter__ yields either scalars or arrays', 
u'2014-04-07T20:47:13Z'),\n", " (u'Efficient bulk append for DataDescriptors', u'2014-04-07T21:15:42Z'),\n", " (u'CSV_DDesc tweaks', u'2014-04-07T22:53:06Z'),\n", " (u'iterchunks(blen=None) never set to a default', u'2014-04-07T23:31:29Z'),\n", " (u'CSV_DDesc does not respect its own dialect', u'2014-04-07T23:35:51Z'),\n", " (u'Need datasets for comprehensive test suite', u'2014-04-08T14:06:51Z'),\n", " (u'JSON data descriptor reads everything into memory',\n", " u'2014-04-08T15:55:29Z'),\n", " (u'Structured array printing is verbose', u'2014-04-08T15:59:59Z'),\n", " (u'Python_DataDescriptor', u'2014-04-08T18:29:47Z'),\n", " (u'Replace use of `ddesc_as_py` for testing with `list`',\n", " u'2014-04-08T18:38:06Z'),\n", " (u'Getting element from array yield element not array',\n", " u'2014-04-08T18:50:57Z'),\n", " (u'Add Array methods to match numpy interface', u'2014-04-08T19:00:20Z'),\n", " (u'Blaze.JSON_DDesc not compatible with Pandas.DataFrame.to_json',\n", " u'2014-04-08T22:45:17Z'),\n", " (u'[WIP] - Design - Bulk transfer between Data Descriptors',\n", " u'2014-04-08T22:35:52Z'),\n", " (u'[WIP] Reduction tweaks', u'2014-04-08T19:54:50Z'),\n", " (u'Add validate to public blaze API', u'2014-04-09T16:24:07Z'),\n", " (u'[WIP] - Playing with data descriptors', u'2014-04-09T15:32:24Z'),\n", " (u'Replace Capability class with dictionary', u'2014-04-09T17:54:32Z'),\n", " (u'Intelligent caching', u'2014-04-09T21:55:58Z'),\n", " (u'Changes to DyND interrupt development workflow', u'2014-04-09T22:03:46Z'),\n", " (u'File system meta DataDescriptor', u'2014-04-10T14:24:37Z'),\n", " (u'[WIP] Rolling reduce design doc', u'2014-04-10T21:54:50Z'),\n", " (u'Shorten data descriptor file names', u'2014-04-11T14:42:12Z'),\n", " (u'[WIP] Adding support for the netCDF3/netCDF4 format',\n", " u'2014-04-11T12:21:48Z'),\n", " (u'Depend on SQLAlchemy for SQL code generation', u'2014-04-11T15:01:13Z'),\n", " (u'Validate and into', u'2014-04-16T22:50:36Z'),\n", " (u'New Data layer', 
u'2014-04-16T01:45:28Z'),\n", " (u'Add an optional, no dependencies configuration to travis',\n", " u'2014-04-15T10:01:36Z'),\n", " (u'Dispatched validate and coerce operations', u'2014-04-15T01:07:33Z'),\n", " (u'Table', u'2014-04-25T01:34:21Z'),\n", " (u'[WIP] allow JSON data descriptor to iterate over series of JSON files',\n", " u'2014-04-21T15:01:36Z'),\n", " (u'blaze/data/{dynd,json}.py hide modules when importing in blaze/data/',\n", " u'2014-05-01T22:44:08Z'),\n", " (u'Encode dates/datetimes in JSON data descriptor', u'2014-05-01T21:31:16Z'),\n", " (u'Table Reductions', u'2014-05-07T17:18:49Z'),\n", " (u'[WIP] Initial version of HDFS support via context manager',\n", " u'2014-05-05T01:57:46Z'),\n", " (u'Python join', u'2014-05-15T20:18:18Z'),\n", " (u'Various fixes to Table expressions', u'2014-05-15T17:10:42Z'),\n", " (u'[WIP] Compute layer operations on pyspark RDDs', u'2014-05-15T15:56:26Z'),\n", " (u'Depend on PyToolz', u'2014-05-14T20:29:57Z'),\n", " (u'Support Datetime in HDF5', u'2014-05-12T20:27:30Z'),\n", " (u'Support variable length strings in HDF5', u'2014-05-12T20:23:05Z'),\n", " (u'Datashape discovery', u'2014-05-08T19:57:26Z'),\n", " (u'Add simple static check on expr', u'2014-05-20T16:16:12Z'),\n", " (u'Various fixes, often in SQL', u'2014-05-19T23:07:28Z'),\n", " (u'Dangling file descriptors in `blaze.data.{csv,json}`',\n", " u'2014-05-20T20:36:15Z'),\n", " (u'Apply and Map generic functions onto TableExprs', u'2014-05-22T02:15:00Z'),\n", " (u'Scalar Expressions', u'2014-05-22T01:19:16Z'),\n", " (u'[WIP] Pyspark', u'2014-05-21T22:54:20Z'),\n", " (u'Add nunique operation', u'2014-05-21T18:30:40Z'),\n", " (u'Implicit Joins', u'2014-05-22T21:01:04Z'),\n", " (u'Booleans', u'2014-05-22T18:44:38Z'),\n", " (u'Use Blaze to benchmark various backends', u'2014-05-22T16:47:13Z'),\n", " (u'DyND OOC Backend', u'2014-05-22T16:41:31Z'),\n", " (u'Trivial demonstration development environment ', u'2014-05-22T16:32:04Z'),\n", " (u'Add timezone support to the 
datetime type', u'2014-05-23T23:45:06Z'),\n", " (u'DyND compute frontend', u'2014-05-23T23:43:30Z'),\n", " (u'Missing data support in DyND', u'2014-05-23T21:57:10Z'),\n", " (u'Jaccard similarity demo', u'2014-05-23T18:28:46Z'),\n", " (u'Label', u'2014-05-23T18:15:48Z'),\n", " (u'Serialization issues with `compute`', u'2014-05-23T14:43:47Z'),\n", " (u'Merge Reorg', u'2014-05-26T17:57:37Z'),\n", " (u'Arbitrary functions', u'2014-05-26T17:45:35Z'),\n", " (u'Blaze Table Object', u'2014-05-26T21:31:19Z'),\n", " (u'Add new quickstart ', u'2014-05-26T21:32:29Z'),\n", " (u'Development blaze on Binstar', u'2014-05-26T22:09:34Z'),\n", " (u'Update Catalog Server', u'2014-05-26T22:25:14Z'),\n", " (u'[WIP] Distinct', u'2014-05-27T15:23:17Z'),\n", " (u'Create `Distinct` term', u'2014-05-27T14:33:45Z'),\n", " (u'Clean up import *', u'2014-05-28T15:53:01Z'),\n", " (u'Jaccard2', u'2014-05-28T19:45:11Z'),\n", " (u'Datashape Discovery', u'2014-05-27T01:01:19Z'),\n", " (u'Impala Backend', u'2014-05-30T16:28:12Z'),\n", " (u'Spark stand-alone mode', u'2014-05-29T15:41:54Z'),\n", " (u'Add compute(Expr, DataDescriptor) implementation',\n", " u'2014-05-30T17:15:17Z'),\n", " (u'merge twitter dataset1 with WDC data', u'2014-05-30T21:08:35Z'),\n", " (u'Python multicolumn groupby', u'2014-05-30T22:41:24Z'),\n", " (u'Spark compute', u'2014-06-06T15:53:38Z'),\n", " (u'Spark', u'2014-06-06T14:54:43Z'),\n", " (u'Scalar Expressions', u'2014-06-06T19:53:30Z'),\n", " (u'Test unicode string support in `blaze.data`', u'2014-06-09T21:45:33Z'),\n", " (u'Delete old Vagrant code, favor conda', u'2014-06-09T22:02:42Z'),\n", " (u'Put `spark` on binstar', u'2014-06-09T23:00:52Z'),\n", " (u'Stress test datashape discovery', u'2014-06-09T22:37:23Z'),\n", " (u'Tune Python Streaming backend', u'2014-06-09T22:36:35Z'),\n", " (u'Fill out Spark implementation', u'2014-06-08T17:28:28Z'),\n", " (u'Jaccard fix', u'2014-06-11T00:33:46Z'),\n", " (u'Vagrant del', u'2014-06-10T22:31:44Z'),\n", " (u'Update documentation for 
reorg branch', u'2014-06-11T17:42:01Z'),\n", " (u\"Multi-input compute doesn't play well with consumable data sources \",\n", " u'2014-06-11T17:55:38Z'),\n", " (u'SciPy 2014 Paper', u'2014-06-11T20:36:46Z'),\n", " (u'[WIP] Reorg Docs', u'2014-06-11T21:23:34Z'),\n", " (u'Blaze server', u'2014-06-16T23:01:29Z'),\n", " (u'Add `into` operation to api', u'2014-06-16T17:33:45Z'),\n", " (u'Interactive Table object', u'2014-06-18T19:35:02Z'),\n", " (u'data: CSV supports sep as alias for delimiter', u'2014-06-18T22:15:10Z'),\n", " (u'projection of filter TableExpr fails on Spark RDDs',\n", " u'2014-06-19T02:18:11Z'),\n", " (u'Delete old stuff', u'2014-06-20T14:05:45Z'),\n", " (u'Structured description of data descriptors', u'2014-06-20T21:44:42Z'),\n", " (u'Various Small fixes', u'2014-06-20T19:24:32Z'),\n", " (u'Fixup quickstart', u'2014-06-23T14:55:26Z'),\n", " (u'Merge reorg', u'2014-06-23T14:46:09Z'),\n", " (u'Imports', u'2014-06-23T19:06:52Z'),\n", " (u'Efficient CSV -> SQL migration', u'2014-06-24T21:40:50Z'),\n", " (u'Multi column join', u'2014-06-24T15:53:10Z'),\n", " (u\"SQL extend doesn't preserve schema\", u'2014-06-26T15:01:46Z'),\n", " (u'Better csv unicode support with `unicodecsv`', u'2014-06-26T17:07:24Z'),\n", " (u'Small fixes', u'2014-06-26T18:37:07Z'),\n", " (u'[WIP] - HDF5 variable length strings', u'2014-06-26T18:13:36Z'),\n", " (u'Scalar coercion - Server selection', u'2014-06-26T00:37:43Z'),\n", " (u'compute on HDF5 with PyTables', u'2014-06-28T17:45:46Z'),\n", " (u'Sample operation', u'2014-06-30T21:25:16Z'),\n", " (u'unicodecsv is slow', u'2014-07-01T15:27:26Z'),\n", " (u'Coerce works on Spark RDDs', u'2014-07-03T01:05:34Z'),\n", " (u'Extend Projection operation to data descriptors', u'2014-07-03T15:32:52Z'),\n", " (u'expr: Join automatically selects all shared columns',\n", " u'2014-07-03T18:48:18Z'),\n", " (u'Skip gzip csv tests on windows py2.x', u'2014-07-03T18:13:34Z'),\n", " (u'Expression Optimization', u'2014-07-03T22:06:57Z'),\n", " (u'INTO 
feature for CSV to DB', u'2014-07-09T16:41:55Z'),\n", " (u'Fix repr when Table is backed by mutable data', u'2014-07-11T18:49:15Z'),\n", " (u\"setup.py doesn't include unicodecsv\", u'2014-07-12T15:01:55Z'),\n", " (u'Added unicde to requirements and docs closes #378',\n", " u'2014-07-12T15:27:28Z'),\n", " (u'Various fixes', u'2014-07-07T14:21:23Z'),\n", " (u'Integer column names not working', u'2014-07-12T17:39:06Z'),\n", " (u\"Conda install doesn't install toolz dependency\", u'2014-07-12T23:29:55Z'),\n", " (u\"spark tests aren't skipped when spark isn't installed\",\n", " u'2014-07-13T01:49:52Z'),\n", " (u\"Don't run spark tests if pyspark isn't available\",\n", " u'2014-07-13T11:43:39Z'),\n", " (u'Assist Spark users in parsing CSV files', u'2014-07-02T21:23:25Z'),\n", " (u'Refactor recursion out of compute ', u'2014-07-02T14:41:59Z'),\n", " (u'Multiprocessing meta-backend', u'2014-07-01T16:46:31Z'),\n", " (u'Data descriptor constructors should specify missing values',\n", " u'2014-07-14T13:43:07Z'),\n", " (u'CSV header handling', u'2014-07-14T13:47:13Z'),\n", " (u'`data.py[...]` should avoid returning an iterator when data is small',\n", " u'2014-07-14T13:48:45Z'),\n", " (u'Improve error message for DataDescriptor.__len__',\n", " u'2014-07-14T13:50:05Z'),\n", " (u'Put docs on readthedocs', u'2014-07-14T13:50:53Z'),\n", " (u'Handle missing data in SQL data descriptor', u'2014-07-14T13:55:50Z'),\n", " (u'`rpy2` integration', u'2014-07-14T14:15:42Z'),\n", " (u'server expression security improvements', u'2014-07-14T19:00:13Z'),\n", " (u'support for Map of Columnwise', u'2014-07-14T20:47:52Z'),\n", " (u'Travis conda', u'2014-07-14T15:54:27Z'),\n", " (u'Reduction dshape and csv missing values', u'2014-07-16T15:45:56Z'),\n", " (u'Outer join', u'2014-07-16T21:28:54Z'),\n", " (u'Access columns as attributes, rather than with strings',\n", " u'2014-07-17T00:41:16Z'),\n", " (u'Add `into` implementations for TableExprs', u'2014-07-17T01:21:50Z'),\n", " (u'Add to `into`', 
u'2014-07-17T01:58:00Z'),\n", " (u'Implement __getattr__', u'2014-07-17T03:39:37Z'),\n", " (u'Consider using setuptools to install instead of distutils',\n", " u'2014-07-17T13:53:32Z'),\n", " (u'How to handle missing values in HDF5?', u'2014-07-17T15:56:05Z'),\n", " (u'Dependency list is incomplete and contradictory', u'2014-07-17T18:33:38Z'),\n", " (u'CSV keyword arguments documentation', u'2014-07-17T19:11:41Z'),\n", " (u'compute: projection of data descriptor uses `.py`',\n", " u'2014-07-17T19:43:51Z'),\n", " (u'CSV: errors and encoding arguments', u'2014-07-17T20:58:54Z'),\n", " (u'SQL databases match nullability to datashape.Option',\n", " u'2014-07-17T22:21:04Z'),\n", " (u'Broken links in \\'blaze.pydata.org\"', u'2014-07-17T22:29:34Z'),\n", " (u'Fixed two links and added google analytics tracking',\n", " u'2014-07-17T23:20:12Z'),\n", " (u'Implement a scalar expression parser', u'2014-07-17T23:03:24Z'),\n", " (u'Add to the CSV docstring', u'2014-07-17T22:54:45Z'),\n", " (u'Cleanup Scalar a bit', u'2014-07-18T12:48:30Z'),\n", " (u'By of merged columns has stopped working.', u'2014-07-18T18:27:18Z'),\n", " (u'Import of blaze.expr.scalar.* breaks merge', u'2014-07-18T18:31:31Z'),\n", " (u'Dev install instructions', u'2014-07-19T16:20:47Z'),\n", " (u'10 minutes to Blaze', u'2014-07-19T22:41:00Z'),\n", " (u'python: by maps call to compute onto child', u'2014-07-20T16:22:49Z'),\n", " (u'Add funders to webpage', u'2014-07-21T22:02:30Z'),\n", " (u'Selection for Date columns in SQL backend produces odd expression',\n", " u'2014-07-22T17:58:56Z'),\n", " (u'Support some NoSQL Database', u'2014-07-23T00:45:17Z'),\n", " (u'flatMapValue PySpark equivalent in Blaze', u'2014-07-24T15:35:39Z'),\n", " (u'Individual columns should be able to repr if not passed in CSV',\n", " u'2014-07-27T17:43:40Z'),\n", " (u'Raise when Table has a different schema than the underlying data',\n", " u'2014-07-27T18:33:43Z'),\n", " (u'Add google analytics to docs', u'2014-07-22T13:25:47Z'),\n", " 
(u'Fix double return', u'2014-07-28T18:40:40Z'),\n", " (u'Relax constraint that `By` must use reductions', u'2014-07-29T22:20:14Z'),\n", " (u'BColz', u'2014-07-31T03:17:53Z'),\n", " (u\"`count` operation doesn't consider missing values\",\n", " u'2014-07-31T17:13:00Z'),\n", " (u\"expr: selection doesn't fail on non-rowwise child\",\n", " u'2014-08-05T14:34:38Z'),\n", " (u'Visualize the capabilities of each backend', u'2014-08-05T20:28:24Z'),\n", " (u'bcolz, blz, and chunks', u'2014-08-05T19:56:38Z'),\n", " (u'Doc refresh', u'2014-08-05T22:36:08Z'),\n", " (u'[WIP] Bcolz copy', u'2014-08-05T16:13:18Z'),\n", " (u'MongoDB Backend', u'2014-08-01T22:32:16Z'),\n", " (u'PyTables computational backend', u'2014-07-30T19:35:04Z'),\n", " (u'Build Blaze on jenkins, upload to binstar blaze-dev account',\n", " u'2014-08-07T16:16:02Z'),\n", " (u'Make blaze.test() return True or False', u'2014-08-08T20:33:29Z'),\n", " (u'documentation link in the README is broken', u'2014-08-09T00:02:22Z'),\n", " (u'Consistent column naming scheme', u'2014-08-12T15:18:24Z'),\n", " (u'Spark by should use reduceby or foldby', u'2014-08-12T18:44:53Z'),\n", " (u'Comprehensive test suite for `into`', u'2014-08-12T18:58:19Z'),\n", " (u'Parallel chunking or streaming backend', u'2014-08-12T19:04:16Z'),\n", " (u'Into test', u'2014-08-12T21:12:49Z'),\n", " (u'pandas: enforce expression column names on `by`', u'2014-08-12T15:32:07Z'),\n", " (u'Add BColz and chunking backend', u'2014-08-12T02:52:50Z'),\n", " (u'Graceful handling of empty results', u'2014-08-13T15:40:37Z'),\n", " (u'into: test foo <- CSV', u'2014-08-13T15:18:20Z'),\n", " (u'SQL Table Overwrite', u'2014-08-14T04:01:10Z'),\n", " (u\"'by' of pandas DataFrame doesn't work as expected\",\n", " u'2014-08-14T13:46:25Z'),\n", " (u'add into(DataFrame, pytables Table)', u'2014-08-11T21:21:07Z'),\n", " (u'[WIP] Feature/csv to sql natively', u'2014-08-10T02:36:58Z'),\n", " (u'Maintain length in table expressions', u'2014-08-15T01:44:20Z'),\n", " (u'`from 
blaze import *` results in override of built-ins ',\n", " u'2014-08-15T14:05:14Z'),\n", " (u'dispatch on mathematical functions', u'2014-08-15T15:17:46Z'),\n", " (u'Open world assumption and 3VL in Blaze', u'2014-08-15T17:14:21Z'),\n", " (u'Compute on scalar expressions', u'2014-08-15T19:39:11Z'),\n", " (u'Overload `__len__` to work on Table Expressions and on Table interactive objects',\n", " u'2014-08-15T18:35:50Z'),\n", " (u'Which packages should be required for blaze, which should be optional?',\n", " u'2014-08-15T22:27:59Z'),\n", " (u'Look towards dplyr for ideas to expand expression input',\n", " u'2014-08-16T13:43:04Z'),\n", " (u'ETL on bad CSV data - what should we do?', u'2014-08-18T16:05:15Z'),\n", " (u'[WIP] - Summary', u'2014-08-18T17:30:20Z'),\n", " (u'Compute on scalar expressions', u'2014-08-18T17:23:05Z'),\n", " (u'comprehensive compute tests', u'2014-08-16T17:57:20Z'),\n", " (u\"[WIP] csv_into (don't merge)\", u'2014-08-18T20:25:04Z'),\n", " (u'Release Blogpost', u'2014-08-19T15:29:15Z'),\n", " (u'General function expressions', u'2014-08-19T18:51:04Z'),\n", " (u\"Don't use eval when evaluating RealMath subclasses\",\n", " u'2014-08-19T20:11:47Z'),\n", " (u'drop and create_index dispatched functions', u'2014-08-20T14:56:29Z'),\n", " (u'Does not list pymongo as a requirement', u'2014-08-20T16:37:23Z'),\n", " (u'Small fixes 3', u'2014-08-20T18:32:03Z'),\n", " (u'WIP: compute on HDF5 with PyTables', u'2014-08-19T21:48:48Z'),\n", " (u'Add persistent storage systems to into comprehensive test',\n", " u'2014-08-20T18:37:17Z'),\n", " (u'SQLAlchemy string types - encoding and fixed lengths',\n", " u'2014-08-20T21:27:43Z'),\n", " (u'[WIP] - `dplyr` interface`', u'2014-08-19T15:45:09Z'),\n", " (u'Bug: columns attribute of TableSymbol is None when creating a schema with discover(tables.Table)',\n", " u'2014-08-21T14:34:33Z'),\n", " (u'WIP: Add create_index / drop_index functionality',\n", " u'2014-08-21T20:32:18Z'),\n", " (u'into implementation for SQL using 
CSV loading', u'2014-08-22T17:15:25Z'),\n", " (u'WIP: RethinkDB for blaze', u'2014-08-23T04:47:39Z'),\n", " (u'Added example rpy2 conversion', u'2014-08-23T01:32:23Z'),\n", " (u'update readme and docs with api changes', u'2014-08-22T18:16:26Z'),\n", " (u'Refactor Chunks', u'2014-08-22T17:27:13Z'),\n", " (u'Continue to test and improve `into`', u'2014-08-22T16:21:23Z'),\n", " (u'[WIP] into(pytables Table, csv) with option to ignore errors in CSV files',\n", " u'2014-08-22T04:35:34Z'),\n", " (u'GZipped CSV <- SQL with new migration system', u'2014-08-23T20:49:44Z'),\n", " (u'Remove old core directory', u'2014-08-23T16:34:17Z'),\n", " (u'Lightweight descriptor for various file formats like Excel, SPSS',\n", " u'2014-08-24T00:25:41Z'),\n", " (u'Moar CSV fixes!', u'2014-09-19T23:07:44Z'),\n", " (u'Parse datetimes in CSV.reader', u'2014-09-17T23:26:43Z'),\n", " (u'Move datetime logic from into to csv.reader', u'2014-09-17T18:40:53Z'),\n", " (u'[WIP] Test table coverage', u'2014-09-17T13:43:57Z'),\n", " (u'Chunked into', u'2014-09-17T12:43:31Z'),\n", " (u'API for moving between type systems', u'2014-09-17T00:58:07Z'),\n", " (u'A Blaze equivalent for: SELECT * WHERE t.column IN list_values',\n", " u'2014-09-16T20:46:45Z'),\n", " (u\"Error in into(DataFrame, '/*.%s.gz' % dataset) that used to work\",\n", " u'2014-09-16T06:51:06Z'),\n", " (u'How should we handle pulling strings out of HDF5?',\n", " u'2014-09-13T21:40:59Z'),\n", " (u'Ideas to clean codebase', u'2014-09-12T17:27:04Z'),\n", " (u'Adding examples, datasets for examples, and .coveragerc ignores.',\n", " u'2014-09-11T18:57:29Z'),\n", " (u'SparkSQL HiveQL', u'2014-09-10T23:18:29Z'),\n", " (u'SparkSQL map', u'2014-09-10T23:17:20Z'),\n", " (u'Google BigQuery', u'2014-09-09T19:23:15Z'),\n", " (u'Google Spreadsheet Table', u'2014-09-09T18:29:54Z'),\n", " (u'[WIP] Handle sqlite INTO call on Windows', u'2014-09-08T21:59:39Z'),\n", " (u'Consider using this tox setup for testing HDFS related work',\n", " 
u'2014-09-08T17:03:29Z'),\n", " (u'NetCDF4 Backend', u'2014-09-08T15:44:04Z'),\n", " (u\"Investigate use of SQLAlchemy's ORM system for sql generation\",\n", " u'2014-09-07T19:42:53Z'),\n", " (u'[RFC] - Arrays', u'2014-09-07T01:23:53Z'),\n", " (u'xfail on sqlite3 command not available on windows',\n", " u'2014-09-07T00:00:21Z'),\n", " (u'Prevent coveralls from commenting', u'2014-09-06T20:24:14Z'),\n", " (u'CSV Headers with Spark', u'2014-09-04T20:36:02Z'),\n", " (u'#362 Sample operation initial work', u'2014-09-04T19:07:41Z'),\n", " (u'General Performance Guideline: Backend comparison',\n", " u'2014-09-04T18:45:26Z'),\n", " (u'Increase testing coverage', u'2014-09-04T17:20:26Z'),\n", " (u'into(Spark/HDFS, SQL DBs)', u'2014-09-03T23:06:30Z'),\n", " (u'Creating a `test_compute_exhaustive.py`', u'2014-09-03T18:57:15Z'),\n", " (u'Add developer docs on how to build a new Expression type',\n", " u'2014-09-03T17:41:11Z'),\n", " (u'Add more usage examples to the docs', u'2014-09-02T22:19:54Z'),\n", " (u'String matching operation', u'2014-09-02T21:34:51Z'),\n", " (u\"str(Table.count()) doesn't show count\", u'2014-09-02T21:03:51Z'),\n", " (u\"SQL <- CSV doesn't work properly with sqlite\", u'2014-09-02T16:39:50Z'),\n", " (u'rollapply/rolling/window operation', u'2014-09-02T15:07:15Z'),\n", " (u'Create frontend to match LINQ syntax', u'2014-09-01T19:49:36Z'),\n", " (u'Support frequent releases', u'2014-09-01T16:26:29Z'),\n", " (u'Display expr information', u'2014-08-30T19:04:29Z'),\n", " (u'Compute pool with timeouts for Server', u'2014-08-30T17:44:40Z'),\n", " (u'PyCon Submission', u'2014-08-28T20:57:25Z'),\n", " (u'Datetime support (and more robust support in general) in PyTables',\n", " u'2014-08-28T14:20:27Z'),\n", " (u'Use COPY function from Psycopg', u'2014-08-28T03:01:19Z'),\n", " (u'Submit paper for PyHPC 2014', u'2014-08-26T16:40:43Z'),\n", " (u'Discussion of how to support a large number of backends',\n", " u'2014-08-26T15:04:21Z'),\n", " (u'Improve internal 
documentation/scripts to update documentation',\n", " u'2014-08-25T16:38:13Z'),\n", " (u'Change scalar_symbol into expr', u'2014-09-26T03:58:53Z'),\n", " (u'[WIP] SciDB backend', u'2014-09-26T01:37:30Z'),\n", " (u'Pytables column head', u'2014-09-26T01:32:40Z'),\n", " (u'string operations', u'2014-09-25T23:49:50Z'),\n", " (u'WIP: Implement ColumnWise for MongoDB', u'2014-09-25T23:11:35Z'),\n", " (u'HBase', u'2014-09-25T22:32:52Z'),\n", " (u'Update server design doc', u'2014-09-25T21:25:46Z'),\n", " (u'Allow ignoring particular exceptions when using glob resources',\n", " u'2014-09-25T20:15:22Z'),\n", " (u\"Blaze channel blaze install doesn't include dependencies\",\n", " u'2014-09-24T13:47:07Z'),\n", " (u'Rename Like, Regex to TextLike, TextRegex', u'2014-09-24T12:13:43Z'),\n", " (u'Insert projections opportunistically into expressions',\n", " u'2014-09-24T01:29:40Z'),\n", " (u'WIP: Fix mysql into', u'2014-09-24T01:14:05Z'),\n", " (u'Attribute expressions', u'2014-09-22T17:21:13Z'),\n", " (u'Support datetime attributes ', u'2014-09-22T13:07:39Z'),\n", " (u'Support hive, presto through pyhive project', u'2014-09-21T19:42:37Z'),\n", " (u'Rename `*_index` with `index_*`', u'2014-09-21T19:32:28Z'),\n", " (u'API: somethoughts / ideas', u'2014-10-01T19:33:49Z'),\n", " (u'Break isnull type operations out into a separate expression',\n", " u'2014-09-30T22:01:53Z'),\n", " (u'API: PyTables/Pandas/HDF5', u'2014-09-30T17:01:41Z'),\n", " (u'Required kwargs for certain dispatched functions.',\n", " u'2014-09-30T16:36:31Z'),\n", " (u'BUG,DOC: \"Examples\" link 404', u'2014-09-30T15:02:59Z'),\n", " (u'Misleading error message when building table', u'2014-09-29T21:20:04Z'),\n", " (u'[WIP] Test table api', u'2014-10-02T14:21:57Z'),\n", " (u'into SQL <- CSV sends header as data', u'2014-09-29T13:47:30Z'),\n", " (u'[WIP] Refactor Expr', u'2014-09-27T02:26:13Z'),\n", " (u'Accept dot-delimited schemaname.tablename ', u'2014-09-27T02:22:32Z'),\n", " (u'API: print/show_backends', 
u'2014-09-26T15:31:01Z'),\n", " (u'Use dir and getattr to dispatch methods based on datashape',\n", " u'2014-09-26T12:18:41Z'),\n", " (u'SQL <- CSV loading errors pop up inappropriately',\n", " u'2014-10-03T20:15:41Z'),\n", " (u'Docs: Include MongoDB examples/docstrings in the website',\n", " u'2014-10-03T20:30:11Z'),\n", " (u'How should we handle user facing warnings?', u'2014-10-03T22:31:52Z'),\n", " (u'Problem converting expression Column to nd.array',\n", " u'2014-10-04T19:12:52Z'),\n", " (u'Datetime access in SQL databases', u'2014-10-06T01:52:39Z'),\n", " (u'More datetime access expressions', u'2014-10-06T01:53:17Z'),\n", " (u'Nested behavior in MongoDB', u'2014-10-06T01:56:14Z'),\n", " (u'Nested behavior in Python, Spark', u'2014-10-06T01:57:01Z'),\n", " (u'Resource for MongoDB connection string', u'2014-10-06T16:32:57Z'),\n", " (u'Handle Gzip complexity in csvopen', u'2014-10-07T15:30:31Z'),\n", " (u'Various fixes 4', u'2014-10-06T15:47:15Z'),\n", " (u'CI: use appveyor to build for windows', u'2014-10-07T16:32:49Z')]" ] } ], "prompt_number": 7 }, { "cell_type": "markdown", "metadata": {}, "source": [ "Replace `list` with `DataFrame`, `np.ndarray`, or a filename in your favorite format to store results in different systems.\n", " " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Inspect Generated Mongo Queries\n", "\n", "Mongo uses a JSON query language. Let's inspect these queries rather than executing them. 
\n", "\n", "*This uses Blaze's internal API.*" ] }, { "cell_type": "code", "collapsed": false, "input": [ "from blaze import compute, TableExpr, dispatch\n", "from blaze.compute.mongo import MongoQuery\n", "@dispatch(TableExpr, MongoQuery, dict)\n", "def post_compute(expr, q, d):\n", "    # post_compute normally sends the finished query to the server.\n", "    # This override returns the generated query instead of running it.\n", "    return q.query" ], "language": "python", "metadata": {}, "outputs": [], "prompt_number": 8 }, { "cell_type": "code", "collapsed": false, "input": [ "compute(users[users.followers > 100][['login', 'followers', 'following', 'blog']].head(10))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 9, "text": [ "({'$match': {'followers': {'$gt': 100}}},\n", " {'$project': {'blog': 1, 'followers': 1, 'following': 1, 'login': 1}},\n", " {'$limit': 10})" ] } ], "prompt_number": 9 }, { "cell_type": "code", "collapsed": false, "input": [ "compute(users.location.count_values().head(10))" ], "language": "python", "metadata": {}, "outputs": [ { "metadata": {}, "output_type": "pyout", "prompt_number": 10, "text": [ "({'$project': {'location': 1}},\n", " {'$group': {'_id': {'location': '$location'}, 'count': {'$sum': 1}}},\n", " {'$project': {'count': '$count', 'location': '$_id.location'}},\n", " {'$sort': {'count': -1}},\n", " {'$limit': 10})" ] } ], "prompt_number": 10 } ], "metadata": {} } ] }