Tweepository
The Tweepository plugin enables the repository to harvest a stream of tweets from a Twitter search.
Installation Prerequisites
The following Perl libraries must be installed on the server before the Bazaar package will function:
- Date::Calc
- Date::Parse
- Encode
- HTML::Entities
- JSON
- LWP::UserAgent
- URI
- URI::Find
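Before installing the package, it can be useful to confirm that these libraries load on the server. The following is a minimal sketch, not part of Tweepository or EPrints: missing_modules() is a hypothetical helper that tries to load each module and reports the ones that fail.

```perl
use strict;
use warnings;

# Modules named in the prerequisites list above.
my @required = qw(
    Date::Calc Date::Parse Encode HTML::Entities
    JSON LWP::UserAgent URI URI::Find
);

# Return the subset of the given module names that cannot be loaded.
# missing_modules() is a hypothetical helper, not part of EPrints.
sub missing_modules {
    my @names = @_;
    my @missing;
    for my $module (@names) {
        # Turn Foo::Bar into Foo/Bar.pm so that require accepts a runtime value.
        ( my $file = "$module.pm" ) =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

my @missing = missing_modules(@required);
if (@missing) {
    print "Missing: $_\n" for @missing;
}
else {
    print "All prerequisites are installed.\n";
}
```

Run it with the same Perl interpreter that EPrints uses, since a module installed for one interpreter may not be visible to another.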
Installation
Install through the EPrints Bazaar
Using
To create a new tweetstream, click on 'Manage Records', then on 'Twitter Feed', and then on the 'Create new Item' button. A new tweetstream object will be created, and you will need to enter two parameters:
- Search String: Passed directly to Twitter as the search parameter.
- Expiry Date: The date on which to stop harvesting this stream.
Once these fields have been completed, click 'Save and Return'.
Harvesting
At 32 minutes past each hour, the Tweepository package harvests each stream. No harvesting is done on creation, so the tweetstream will initially be empty. Each tweet is processed to:
- extract hashtags
- extract mentioned users
- resolve URL redirects
These data are summarised across all tweets in the tweetstream object.
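The extraction and summary steps can be sketched as follows. This is an illustration under simple assumptions, not the plugin's actual code: extract_entities() and summarise() are hypothetical names, the patterns treat any run of word characters after '#' or '@' as a hashtag or mention, and URL redirect resolution is left out because it needs network access.

```perl
use strict;
use warnings;

# Extract hashtags and mentioned users from one tweet's text.
# Deliberately simple patterns: '#' or '@' followed by word characters,
# and not preceded by a word character (so e-mail addresses are skipped).
sub extract_entities {
    my ($text) = @_;
    my @hashtags = map { lc $_ } $text =~ /(?<!\w)#(\w+)/g;
    my @mentions = map { lc $_ } $text =~ /(?<!\w)\@(\w+)/g;
    return { hashtags => \@hashtags, mentions => \@mentions };
}

# Count how often each hashtag and mention appears across all tweets,
# similar in spirit to the summary held on a tweetstream.
sub summarise {
    my (@tweets) = @_;
    my %summary = ( hashtags => {}, mentions => {} );
    for my $text (@tweets) {
        my $entities = extract_entities($text);
        for my $kind (qw(hashtags mentions)) {
            $summary{$kind}{$_}++ for @{ $entities->{$kind} };
        }
    }
    return \%summary;
}

my $summary = summarise(
    'Trying out #EPrints with @alice',
    '@alice and @bob discuss #eprints and #opendata',
);
for my $tag ( sort keys %{ $summary->{hashtags} } ) {
    print "$tag: $summary->{hashtags}{$tag}\n";
}
```

Values are lower-cased before counting so that #EPrints and #eprints are treated as the same hashtag; whether the plugin normalises case in this way is an assumption.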
Viewing a Tweetstream
To view a tweetstream, click on 'Manage Records', then 'Twitter Feed'.
The resulting screen lists all Twitter streams that the logged-in user has access to. Clicking on the view icon (the magnifying glass) will bring up the view screen, which shows all metadata set on this Twitter feed. At the top of the page will be a link to the tweetstream's page. It will be of the form:
http://repository.foo.com/id/tweetstream/5
Below is an example of a tweetstream page:
Exporting
Due to the architecture of the Twitter feeds (see below), exporting with the standard EPrints exporters (e.g. XML) will only work if both the tweet dataset and the tweetstream dataset are exported. For this reason, export plugins have been provided for tweetstreams. Currently, a tweetstream can be exported as:
- CSV
- HTML
- JSON
Architecture
Both tweetstreams and tweets are EPrints Data Objects. Each tweet object stores the ID of all tweetstreams to which it belongs. This allows tweets to appear in more than one stream, but only be stored once in the database.
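The relationship can be pictured with plain Perl data structures. This is only a sketch: the field names (text, tweetstreams), the IDs, and tweets_in_stream() are illustrative assumptions, not the plugin's actual schema or API.

```perl
use strict;
use warnings;

# Each tweet is stored once, keyed by its ID, and lists the IDs of
# the tweetstreams it belongs to. Field names are illustrative only.
my %tweets = (
    1001 => { text => 'First tweet',  tweetstreams => [5] },
    1002 => { text => 'Second tweet', tweetstreams => [ 5, 7 ] },
    1003 => { text => 'Third tweet',  tweetstreams => [7] },
);

# Return the IDs of all tweets that belong to the given tweetstream,
# in ascending order.
sub tweets_in_stream {
    my ( $tweets, $stream_id ) = @_;
    my @ids;
    for my $tweet_id ( sort { $a <=> $b } keys %$tweets ) {
        my $streams = $tweets->{$tweet_id}{tweetstreams};
        push @ids, $tweet_id if grep { $_ == $stream_id } @$streams;
    }
    return @ids;
}

my @stream_5 = tweets_in_stream( \%tweets, 5 );
my @stream_7 = tweets_in_stream( \%tweets, 7 );
print "Stream 5: @stream_5\n";    # prints "Stream 5: 1001 1002"
print "Stream 7: @stream_7\n";    # prints "Stream 7: 1002 1003"
```

Tweet 1002 appears in both streams but exists only once. Because the membership is recorded on the tweet rather than on the tweetstream, a tweetstream record on its own does not carry its tweets.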
Permissions
The z_tweepository_cfg.pl file contains the following:
$c->{roles}->{"tweetstream-admin"} = [
    "datasets",
    "tweetstream/view",
    "tweetstream/details",
    "tweetstream/edit",
    "tweetstream/create",
    "tweetstream/destroy",
    "tweetstream/export",
];
$c->{roles}->{"tweetstream-editor"} = [
    "datasets",
    "tweetstream/view",
    "tweetstream/details:owner",
    "tweetstream/edit:owner",
    "tweetstream/create",
    "tweetstream/destroy:owner",
    "tweetstream/export",
];
$c->{roles}->{"tweetstream-viewer"} = [
    "tweetstream/view",
    "tweetstream/export",
];
push @{$c->{user_roles}->{admin}}, 'tweetstream-admin';
push @{$c->{user_roles}->{editor}}, 'tweetstream-editor';
push @{$c->{user_roles}->{user}}, 'tweetstream-viewer';
This defines three roles. The admin role:
- Can create tweetstreams
- Can destroy tweetstreams
- Can see tweetstream details
- Can edit tweetstreams
- Can see the list of tweetstreams in 'Manage Records'
- Can view tweetstream abstract pages
- Can export tweetstreams
The editor role:
- Can create tweetstreams
- Can destroy tweetstreams that they created
- Can see details of tweetstreams that they created
- Can edit tweetstreams that they created
- Can see the list of tweetstreams in 'Manage Records'
- Can view tweetstream abstract pages
- Can export tweetstreams
The viewer role:
- Can view tweetstream abstract pages (but needs to know the URL)
- Can export tweetstreams
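The effect of the ":owner" suffix in the editor role can be illustrated with a small stand-alone check. This is a simplified sketch and not how EPrints itself evaluates privileges: role_allows() is a hypothetical function that treats a plain privilege as always applying and a ":owner" privilege as applying only to tweetstreams the user created.

```perl
use strict;
use warnings;

# Privileges copied from the tweetstream-editor role above.
my @editor = (
    "datasets",               "tweetstream/view",
    "tweetstream/details:owner", "tweetstream/edit:owner",
    "tweetstream/create",     "tweetstream/destroy:owner",
    "tweetstream/export",
);

# Simplified check, NOT the EPrints implementation: a plain privilege
# always applies, while a ":owner" privilege applies only when the
# user owns the tweetstream in question.
sub role_allows {
    my ( $privs, $action, $is_owner ) = @_;
    for my $priv (@$privs) {
        return 1 if $priv eq $action;
        return 1 if $is_owner && $priv eq $action . ':owner';
    }
    return 0;
}

for my $is_owner ( 1, 0 ) {
    my $ok = role_allows( \@editor, "tweetstream/destroy", $is_owner );
    printf "destroy (owner=%d): %s\n", $is_owner, $ok ? "allowed" : "denied";
}
```

Under this reading, an editor may destroy a tweetstream they created but not one created by someone else, which matches the role descriptions above.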