Difference between revisions of "User:Ckeene/Ideas for future versions of eprints"

From EPrints Documentation

Revision as of 12:01, 19 April 2008

Long term : fit in with academic workflow

Academics make funding bids, do research, etc, then write a draft article, then submit to journal(s), get published. (actually I don't really have a clue what academics do, but hopefully this is a good guess!)

  • All of these are examples of creating documents and revising documents and sharing documents. This isn't too far from what a repository is. We have thought about helping integrate with the desktop by supporting WebDAV. [LES CARR]

At the moment we 'bolt on' to the end of this, but what if we could play a bigger role?

This may be controversial, as in a sense it moves eprints away from being just a repository. Perhaps, instead of putting any extra functionality into eprints, we should look at what popular 'research process management' solutions are out there and see how we can work with them (integration, APIs, etc). If we had a tool that helps people manage their bids and funding awards, lets them refer to old ones, and lets them see the documents they produced as a result (and which just so happened to make those documents available to the world), then this may well appeal to many academics. Of course at this point eprints stops being what it is and becomes something else, so any moves to accommodate such things would need to be carefully thought out.

Allow users to be authenticated by an external system

See my email message from April 2008.

At the moment eprints can be configured to use an LDAP server for authentication. Eprints takes the user credentials and then passes them to the LDAP server for authentication.

Perhaps a more generic extension to this is to allow authentication to be outsourced completely.

E.g. when authentication is needed, the user is redirected to a predefined webpage (probably on the institution's website) which handles login, and which can pass back to eprints a successful login and the username of the user who has logged in.

  • I think this is already what we do with eprints.ecs, as all authentication is handled by a single login page. I will check with Chris [LES CARR]

This is a little similar to what EZproxy can do (http://www.oclc.org/us/en/support/documentation/ezproxy/usr/cgi.htm) and also, in a way, to how Shibboleth works.

The advantage is that it would allow a wide variety of campus authentication systems to be used to authenticate eprints users, and eprints would not need to worry about handling passwords (and being secure with them, which is a key issue!).
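
As a rough illustration of the "pass back a successful login and the username" step, here is a minimal sketch in Python. It assumes the external login page and the repository share a secret key and that the login page hands back a signed token. The names `SHARED_SECRET`, `sign_login` and `verify_login`, and the token layout, are all made up for this example; this is not how eprints, EZproxy or Shibboleth actually do it.

```python
import hashlib
import hmac
import time
from typing import Optional

# Hypothetical shared secret, agreed between the login page and the repository.
SHARED_SECRET = b"change-me"
MAX_AGE_SECONDS = 300  # reject tokens older than five minutes


def sign_login(username: str, issued_at: int, secret: bytes = SHARED_SECRET) -> str:
    """Run by the institution's login page after a successful login."""
    message = f"{username}|{issued_at}".encode("utf-8")
    signature = hmac.new(secret, message, hashlib.sha256).hexdigest()
    return f"{username}|{issued_at}|{signature}"


def verify_login(token: str, now: Optional[float] = None,
                 secret: bytes = SHARED_SECRET) -> Optional[str]:
    """Run by the repository: return the username, or None if the token is bad."""
    try:
        username, issued_at_text, signature = token.rsplit("|", 2)
        issued_at = int(issued_at_text)
    except ValueError:
        return None  # wrong number of parts, or a non-numeric timestamp

    # Recompute the signature and compare in constant time.
    expected = sign_login(username, issued_at, secret).rsplit("|", 1)[1]
    if not hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8")):
        return None

    # Refuse tokens that are too old or dated in the future.
    current = time.time() if now is None else now
    if not 0 <= current - issued_at <= MAX_AGE_SECONDS:
        return None
    return username
```

The point of the sketch is that the repository never sees a password: it only checks that the username came from the trusted login page and is recent.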

Enhance the concept of authors (and editors)

Eprints has a type of person called a 'depositor'. They can log in, submit stuff, have a profile, list what they have submitted, etc. Nice.

But Authors as a concept do not really exist. They are just text strings in the system. Sure, we can generate a view of all authors (often so large and with so many variant entries that it is of limited use).

Having an 'object' for authors would allow, for example, one REAL Author to have multiple display names, so that if one paper has them down as "C Keene" and another as "Prof C.J.Keene" eprints can still treat them as the same person, even if they show up differently on different papers. It allows for clever things, like allowing other systems to have a list of one person's research (not just research that so happens to have the same name attached to it; how many John Smiths work at your Uni?). And perhaps in the future it would allow 'Authors' (not just depositors) to log in and see everything of theirs in the repository (everything they have co-authored, not just deposited) and 'do stuff with it'.

  • But you can already do all these things. That's why we have ids attached to authors, not just their names. You can View by Author IDs, and produce complete lists of an individual's research. As you point out there are USER objects which you could provide a bigger role for, beyond identifying the depositor as they do now. [LES CARR]
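
To make the idea of one author with several display names concrete, here is a small Python sketch that groups records by an author id rather than by the name printed on each paper. The record layout (`eprintid`, `creators`, `id`, `name`) and the `Author` class are assumptions for illustration only, not the actual eprints data model.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    """One real person, however their name is printed on each paper."""
    author_id: str
    display_names: set = field(default_factory=set)
    eprint_ids: list = field(default_factory=list)


def build_author_index(eprints):
    """Group eprints by author id instead of by name string."""
    authors = {}
    for eprint in eprints:
        for creator in eprint["creators"]:
            author = authors.setdefault(creator["id"], Author(creator["id"]))
            author.display_names.add(creator["name"])
            author.eprint_ids.append(eprint["eprintid"])
    return authors


# Hypothetical records: the same person appears under two name variants.
records = [
    {"eprintid": 1, "creators": [{"id": "ckeene", "name": "C Keene"}]},
    {"eprintid": 2, "creators": [{"id": "ckeene", "name": "Prof C.J.Keene"},
                                 {"id": "jsmith1", "name": "John Smith"}]},
]
index = build_author_index(records)
```

Here `index["ckeene"]` holds both name variants and both eprint ids, while a second John Smith with a different id would get a separate entry.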

Enhance the eprint abstract page (a la flickr)

The Eprints team briefly demonstrated this at OR08. It looked great.

Basically two things:

  • At the moment many people come in via Google, land on a page and that is it. The Abstract page could offer 'see similar items' and 'people who looked at this also looked at', plus tags etc for enhanced navigation. As well as these autogenerated facilities, perhaps there could be something where repository managers can say 'if a record belongs to chemistry then advertise item x, or display this promotional image/text'.
  • Something I know has been discussed is the idea that it is the item that is important. For documents, show a large preview; images and movies should be shown on the page. Again, I know people often refer to Flickr as a way of showing the content, with the metadata still visible but not dominant, and I would add my vote in saying this sounds like the right direction.

One Repository, multiple views (long term?)

At the moment eprints has the model: One repository database has one user interface.

I think there is scope in changing this to one repository, many interfaces. I include deposit/user area, branding, and certain config in with 'interface'.

  • Some Universities have created separate etheses repositories. Surely it would be better if everything was in one database, but they were able to create a separate interface, so that thesis depositors would see a user area just for them, with unique branding, etc. The Postgrad Office may want a unique deposit/management process for theses and not just the generic deposit process.
  • Some Universities have semi-autonomous units who have their own branding and organisation. These units may toy with a repository, but want to do it their way, with their deposit process, branding, a search just for their items, etc (but may be happy with the idea that their research shows up in the main repository as well). This would allow the University to create a second interface to accommodate them, rather than them creating their own repository.
  • Repository Managers go to departments to try and encourage them to make use of the IR. A Department may be interested, but again, want specific browse views and searches that can appear on their departmental pages (they may, for example, hate the idea of unpublished items showing up with published ones, or only want their items to show). The ability to say 'sure we can set that up' is a huge plus.

Of course some of this is technically possible (creating a search form which only searches certain criteria, or creating extra browse views), but I think a more comprehensive approach could have some potential. I guess what I am saying is that to all intents and purposes, the public and users would see two (or more) separate repositories, but behind the scenes there is just one. In fact, situations like the Soton/ECS dual setup could potentially be nicely accommodated with this!

  • There is a general issue of making special collections a lot easier to manage, and that is something that we are working on. (That will support research groups and journals as well as scholarly collections.) However the issue of allowing groups to have their own listings on their own pages is often better supported by allowing them to import blank (unbranded) listings from the repository which are then branded by the local site using stylesheets. This puts control of the department/research group's pages into their hands, which makes them happy. [LES CARR]
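
A very rough Python sketch of "one repository, many interfaces": each interface is just a title (standing in for branding) plus a filter over the same set of records. The interface names, field names and the `INTERFACES` structure are invented for illustration and do not correspond to any existing eprints configuration.

```python
# Hypothetical interface definitions: branding plus a filter over one database.
INTERFACES = {
    "main": {"title": "University Research Repository", "filter": {}},
    "etheses": {"title": "University eTheses", "filter": {"type": "thesis"}},
    "chemistry": {
        "title": "Chemistry Research",
        "filter": {"department": "Chemistry", "status": "published"},
    },
}


def records_for(interface_name, records):
    """Return only the records that this interface is allowed to show."""
    wanted = INTERFACES[interface_name]["filter"]
    return [
        record for record in records
        if all(record.get(key) == value for key, value in wanted.items())
    ]


# One shared set of records behind every interface.
all_records = [
    {"id": 1, "type": "article", "department": "Chemistry", "status": "published"},
    {"id": 2, "type": "article", "department": "Chemistry", "status": "unpublished"},
    {"id": 3, "type": "thesis", "department": "History", "status": "unpublished"},
]
```

With this data the 'chemistry' interface would show only record 1, the 'etheses' interface only record 3, and the main interface everything.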

Auto LOC classify

Eprints has a simple LOC based classification system set up by default. This is good, but researchers don't find it easy to decide which LOC category to use, and find it all a little complex.

It would help if eprints could use some web based service to find a suitable LOC heading to suggest to the user, based on the journal/book title (i.e. if there is a web service that can be passed a title or ISSN/ISBN and return a LOC heading for that journal/book).

  • We attempted this in an old JISC Project called "EPrints UK" using Dewey instead of LOC, but it relied on an experimental service from OCLC. If anyone could point us at such a service for LoC we would happily use it! [LES CARR]

Auto submit to other repositories

"Why should I deposit in the IR when I have arxiv.org where all my peers submit to as well?" "If I have to deposit to my Research Council's repository* why should I deposit to the IR as well?"

Good questions! And what if the answer was "By submitting to the IR it will automatically be uploaded to the other repository just by ticking a box"?

With SWORD this should now finally be possible. This could be a killer feature and I think it could do more than just support the protocol. It should use logic to recommend other repositories that the user can just tick for it to try and submit the item to as well, i.e. if the user is a physicist suggest arXiv, or if the item has been categorized as 'Economics' then suggest RePEc, or if it has a co-author at Watford Gap University then suggest that it is deposited there as well.
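
The recommendation logic could be sketched along these lines in Python. This only covers choosing which repositories to offer as tick boxes, not the SWORD deposit itself, and the field names (`depositor_department`, `subjects`, `creators`, `institution`) are assumptions made up for the example.

```python
def suggest_targets(item, home_institution):
    """Suggest other repositories to offer as tick boxes for this item."""
    suggestions = []

    # Rule 1: physicists are likely to want arXiv.
    if item.get("depositor_department") == "Physics":
        suggestions.append("arXiv")

    # Rule 2: items categorised as Economics could also go to RePEc.
    if "Economics" in item.get("subjects", []):
        suggestions.append("RePEc")

    # Rule 3: co-authors elsewhere suggest their institution's repository.
    for creator in item.get("creators", []):
        institution = creator.get("institution")
        if institution and institution != home_institution:
            label = f"{institution} repository"
            if label not in suggestions:
                suggestions.append(label)

    return suggestions


# Hypothetical item with one co-author at another university.
example_item = {
    "depositor_department": "Physics",
    "subjects": ["Economics"],
    "creators": [
        {"name": "C Keene", "institution": "Home University"},
        {"name": "J Smith", "institution": "Watford Gap University"},
    ],
}
targets = suggest_targets(example_item, home_institution="Home University")
```

In practice the rules would presumably be configurable by the repository manager rather than hard-coded like this.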

Refine search / facet search

See an example on the left of this worldcat page.

On search results, allow people to refine by year, author, department, type, with full text, published, peer reviewed, etc.
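
As a minimal sketch of what refining involves, the Python below counts how many results fall under each value of a facet (for the sidebar) and narrows the result list once a user picks a value. The field names and sample records are hypothetical, and a real implementation would most likely do this in the search index rather than in memory.

```python
from collections import Counter


def facet_counts(results, fields):
    """Count how many results have each value, for each facet field."""
    counts = {name: Counter() for name in fields}
    for record in results:
        for name in fields:
            value = record.get(name)
            if value is not None:
                counts[name][value] += 1
    return counts


def refine(results, **selected):
    """Keep only the results matching every selected facet value."""
    return [
        record for record in results
        if all(record.get(name) == value for name, value in selected.items())
    ]


# Hypothetical search results.
results = [
    {"title": "Paper A", "year": 2007, "type": "article", "full_text": True},
    {"title": "Paper B", "year": 2007, "type": "thesis", "full_text": False},
    {"title": "Paper C", "year": 2008, "type": "article", "full_text": True},
]
counts = facet_counts(results, ["year", "type"])
narrowed = refine(results, year=2007, type="article")
```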


Make document available in other formats

Similar in a way to the first idea. If someone deposits a PDF/PS/RTF then make it available in a number of file types, either converting it on the fly or perhaps converting it as it is deposited (this uses more space, but may have archival plus points).


These are just my personal thoughts, partly based on feedback we've received from academics, though the last idea was mainly due to Citeseer making documents available in many formats (e.g. http://citeseer.ist.psu.edu/lagoze01open.html). Like I say, these are all non-trivial, but I hope it makes for good food for thought!

Things that have made it in to Eprints 3.1

Auto convert MS Word to PDF

[April 08 update: this looks like it is a plugin in 3.1, excellent]

The ability to let users upload a Word file and have eprints turn it into a PDF (and store both) would help many users and make the system more attractive to use.

Telephone call..

Humanities academic: "err hi, I've heard you're the people to contact to put my research online, how do I go about it?"

eprints admin: "cool, you just need to find your final draft of your article, convert it to PDF, using any PDF tool such as Acrobat or PDFcreator, which you may need to install, if you have admin rights on your PC, and then...." [sounds of despair from other end of phone, caller hangs up]

Of course, academics hardly ever phone up with such enthusiasm, but this talk of complicated stuff doesn't make it easy. If there's some sort of *nix library/tool out there which eprints could use to convert MS Word files to PDF on the fly, then this would make it so much easier from the researcher's point of view. They just need to upload their Word file, and it's made available as a PDF to users.
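
One possible shape for such a conversion step, sketched in Python. It assumes a LibreOffice-style `soffice` binary is installed on the server and is run in headless mode; that choice of tool is an assumption for this example and is not a description of how the 3.1 plugin works. The helper names are made up.

```python
import subprocess
from pathlib import Path


def build_convert_command(source, out_dir, office_binary="soffice"):
    """Build the command line for a headless Word-to-PDF conversion."""
    return [
        office_binary,
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(out_dir),
        str(source),
    ]


def convert_to_pdf(source, out_dir):
    """Convert an uploaded Word file and return the expected PDF path.

    Assumes the office binary is on the PATH; raises if the command fails
    or takes longer than two minutes.
    """
    source = Path(source)
    out_dir = Path(out_dir)
    subprocess.run(build_convert_command(source, out_dir), check=True, timeout=120)
    return out_dir / (source.stem + ".pdf")
```

The repository would then store the returned PDF alongside the original Word file, so the depositor never has to install anything.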