Installing and configuring EPrints: hands-on workshop

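In this session you will install EPrints, create an archive, change the site template and stylesheet, and add a metadata field.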

The intention is to give you some confidence using the system and finding your way around. Feel free to ask the assistants for help, or about anything you find confusing or interesting.

Don't worry about finishing everything. If you get stuck: ask!

To make this session practical we have pre-installed some Linux boxes with the required Perl modules, MySQL and Apache 2. You can SSH to your Linux box with the username "epdemo". The password will be given to you by the assistants.

Your machine is:

                                          .ecs.soton.ac.uk

Log in to it now!

We've already provided an Apache that is preconfigured to run with EPrints, so you don't have to do that bit. It listens on port 8080, runs as user "epdemo" and group "epdemo", and its configuration already has the line "Include /opt/eprints2/cfg/apache.conf" in it.

The apache files are in /opt/httpd8080

Part One: Unpack and install eprints:

Enter the following UNIX commands:

cd /opt/
tar xzvf eprints.tgz
cd eprints-2*
./configure --with-apache=2 --with-user=epdemo --with-group=epdemo --disable-diskfree
./install.pl
cd /opt/eprints2

Now just check it's really installed:

ls

EPrints is now installed.

What we've missed

We're skipping some stages of installing EPrints:

Part Two: Create an archive:

/opt/eprints2/bin/configure_archive

This program asks a whole load of questions:

Archive ID:
myid (any lowercase string, no spaces!)
Hostname:
the full address of the machine you are logged into, eg. footle.ecs.soton.ac.uk
Port:
8080
Alias:
leave blank
Admin Email:
use your own!
Archive Name:
up to you (free text)
Database Name:
use the default
MySQL Host:
use the default
MySQL Port:
use the default
MySQL Socket:
use the default
MySQL User:
use the default
MySQL password:
pick an eight character password (random letters is good). This is used by the eprints software to connect to mysql. You won't need to remember it.
Create database:
yes
DB root password:
use default (no password!)
create config files:
yes
return to continue:
return!
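
If you'd like to check your answers before typing them in, here is a minimal sketch of two helpers. Both function names are made up for this handout and are not part of EPrints. The ID check is a deliberately cautious reading of "lowercase string, no spaces": it only allows the letters a-z and digits, which is an assumption rather than the exact rule configure_archive applies.

```shell
# Hypothetical helpers for this workshop; not part of EPrints.

# Succeeds only for a non-empty ID made of lowercase letters and digits.
# (Assumption: stricter than the handout's "lowercase, no spaces".)
valid_archive_id() {
    case "$1" in
        ''|*[!abcdefghijklmnopqrstuvwxyz0123456789]*) return 1 ;;
        *) return 0 ;;
    esac
}

# Prints eight random lowercase letters for the MySQL password prompt.
# 4096 random bytes yield roughly 400 letters, far more than the 8 needed.
random_password() {
    head -c 4096 /dev/urandom | LC_ALL=C tr -dc 'a-z' | cut -c1-8
}
```

For example, "valid_archive_id myid && echo ok" prints ok, while an ID containing a space or a capital letter is rejected.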

configure_archive should have created a MySQL database, a default XML config file at /opt/eprints2/archives/myid.xml, and a default configuration and directories for the archive under /opt/eprints2/archives/myid/
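
A quick way to confirm the files are really there is sketched below. The function name check_archive_files is hypothetical, and it only looks at the two paths named in this handout; it does not check the MySQL database.

```shell
# Hypothetical helper: report whether the config file and archive
# directory named in this handout exist.
# Arguments: archive ID, then the EPrints directory (defaults to
# /opt/eprints2, the path used in this workshop).
check_archive_files() {
    caf_id=$1
    caf_base=${2:-/opt/eprints2}
    caf_status=0
    for caf_path in \
        "$caf_base/archives/$caf_id.xml" \
        "$caf_base/archives/$caf_id"
    do
        if [ -e "$caf_path" ]; then
            echo "found:   $caf_path"
        else
            echo "MISSING: $caf_path"
            caf_status=1
        fi
    done
    return "$caf_status"
}
```

Run it as "check_archive_files myid". It prints one line per path and returns non-zero if anything is missing.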

Create the SQL tables

/opt/eprints2/bin/create_tables myid

Import the "subjects" table (Library of Congress is the default). The file is imported from /opt/eprints2/archives/myid/cfg/subjects

The --force option just stops it asking "are you sure?"

/opt/eprints2/bin/import_subjects myid --force

Copy the static images and web pages from the cfg dir to the archive website. This also applies the template to the web pages.

/opt/eprints2/bin/generate_static myid

Create the Apache config files. This generates them for every EPrints archive on this server, so myid is not needed.

/opt/eprints2/bin/generate_apacheconf

Start the EPrints indexer. It runs in the background to index the words in fields and documents.

/opt/eprints2/bin/indexer start

Start apache (on port 8080)

Note: this is a local setup of Apache just for the demo. You'd usually start and stop it as the root user, but we can't let you be root on our machines!

/usr/sbin/httpd -f /opt/httpd8080/conf/httpd.conf

Now open a web browser and go to http://footle.ecs.soton.ac.uk:8080/ (where footle is your machine name). If it's not working feel free to ask for help.
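
You can also check from the command line that the server is answering. The sketch below assumes curl is installed on your box, which we have not verified. The function names site_url and check_site are made up for this handout.

```shell
# Hypothetical helpers for this workshop; not part of EPrints.

# Builds the home page URL from a host name and a port (default 8080).
site_url() {
    printf 'http://%s:%s/\n' "$1" "${2:-8080}"
}

# Fetches the home page with curl and prints the HTTP status code.
# Succeeds only if the code is 200. curl reports 000 if it could not
# connect at all.
check_site() {
    cs_code=$(curl -s -o /dev/null --max-time 10 \
        -w '%{http_code}' "$(site_url "$@")" || true)
    echo "HTTP status: $cs_code"
    [ "$cs_code" = "200" ]
}
```

For example, "check_site footle.ecs.soton.ac.uk 8080", with footle replaced by your machine name.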

If you want to stop the webserver use:

killall httpd

That's not really a good way to do it normally, but handy for our demo session.

Create an administration user

To create an administrator user called "adminuser" with password "secret" do this:

/opt/eprints2/bin/create_user myid adminuser fred@example.com admin secret

Make sure you can log into the "user area" of the new site using the admin user's account. You'll have to provide a valid name for "adminuser" before it will let you continue. By default users without names may not do anything!

What we've missed

We're skipping some stages of setting up an archive. They are:

Part Three: Changing the Template and Stylesheet

To do this you are going to need to edit some of the configuration files for our new archive, using a text editor. vi, emacs and pico are available, maybe others.

Every page on eprints is wrapped in the site template. This can be found at /opt/eprints2/archives/myid/cfg/template-en.xml

The archive website can be split into four sections: static, dynamic, views and abstracts. Each type of page is generated in a different way. When you change the style and want to reflect that change in the site, you need to know how to get EPrints to update the part of the site in question.

Static Pages

These are pages which do not change. They are built by the "generate_static" command which copies files from the directories /opt/eprints2/archives/myid/cfg/static/general/ and /opt/eprints2/archives/myid/cfg/static/en/ into the live website directory /opt/eprints2/archives/myid/html/en

The "en" indicates English language files. The general/ directory contains non-language-specific files like icons and the style sheet.

Files in the cfg/static/en with the suffix .xpage are XML files containing only the contents of the page. Rather than just copy these, generate_static applies the site template to these and renames them to .html in the website dir.
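
To see which pages will get the template treatment, you can list the .xpage files alongside the .html names described above. This is only an illustration of the renaming. The function name list_xpage_targets is made up, and it does not run generate_static or touch any files.

```shell
# Hypothetical helper: for each .xpage file in the given directory,
# print the .html name this handout says generate_static produces.
list_xpage_targets() {
    for lxt_src in "$1"/*.xpage; do
        [ -e "$lxt_src" ] || continue   # directory has no .xpage files
        lxt_name=$(basename "$lxt_src" .xpage)
        echo "$lxt_name.xpage -> $lxt_name.html"
    done
}
```

Try "list_xpage_targets /opt/eprints2/archives/myid/cfg/static/en". Files without the .xpage suffix are not listed, since they are simply copied.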

Dynamic Pages

Everything on the website which starts with /perl/, such as the latest deposits page, all the submission pages and the search pages.

These pages are created by mod_perl. The mod_perl part of eprints (the web server) only loads the config files once when you start apache.

To force the webserver to reload the configuration, either stop and start it, or run "force_config_reload". Although this is inefficient, it's handy if you're making lots of little changes.

"View" Pages

Everything on the website which starts with /view/

These are the browse-by-subject, browse-by-year, etc. pages. They are built by the script "generate_views", run either on the command line or as an automated cron job.

We are going to ignore view pages to keep the workshop simple.

"Abstract" Pages

The pages which describe a record. eg. http://eprints.soton.ac.uk/44/

These are generated by the webserver part of eprints (see dynamic pages, above) but can be generated on the command line using generate_abstracts.

We are also going to ignore abstract pages in today's workshop.

Changing the Web Site Template

Edit /opt/eprints2/archives/myid/cfg/template-en.xml

Add a horizontal rule under the title: find the "h1" element and on the next line add <hr />

Note that this file is XML so all tags must be closed correctly.
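
If you want to catch an unclosed tag before reloading, a well-formedness check along these lines may help. It assumes the xmllint tool (from libxml2) is installed, which we have not verified on the workshop machines. The function name check_xml is made up. If the template relies on entities that xmllint does not know about, it may complain even though EPrints is happy, so treat a failure as a prompt to look rather than a verdict.

```shell
# Hypothetical helper: succeed only if the file parses as well-formed XML.
# Assumes xmllint (from libxml2) is installed; returns 2 if it is not.
check_xml() {
    if ! command -v xmllint >/dev/null 2>&1; then
        echo "xmllint not found; cannot check $1" >&2
        return 2
    fi
    # --noout: parse the file and report errors, but print no document.
    xmllint --noout "$1"
}
```

For example, "check_xml /opt/eprints2/archives/myid/cfg/template-en.xml". A tag written as <hr> instead of <hr /> should be reported as an error.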

If you feel confident, make some more interesting changes too.

Run force_config_reload and generate_static to apply your new template to the static and dynamic pages. Note that we are skipping generate_views and generate_abstracts.

/opt/eprints2/bin/generate_static myid
/opt/eprints2/bin/force_config_reload myid

Changing the Stylesheet

Edit /opt/eprints2/archives/myid/cfg/static/general/eprints.css

Edit the "background" value for .header and .footer to be #ccffcc (light green).

Make more changes if you like.

Copy the new stylesheet into the live site:

/opt/eprints2/bin/generate_static myid

Have a shufty around the website using a web browser to see your changes!

Part Four: Adding a Field

There's an easy way and a hard way to add a metadata field. The hard way is to add it when you've already got data you care about in your archive, because you have to add it to the config files then alter the SQL tables by hand to accommodate it. (There's a page on wiki.eprints.org if you ever need to know how to do that).

But for today, the easy way: we edit the config files then run "erase_archive" to destroy the database and website (but NOT the configuration). Then run create_tables, import_subjects etc. again.

We are going to add a new field "colour of cover" to eprints of type book and book_section.

EPrint types are described in /opt/eprints2/archives/myid/cfg/metadata-types.xml

We need a database field name for this field. Use cover_col.

Add the configuration for the field type and basic properties

Edit the fields configuration file: /opt/eprints2/archives/myid/cfg/ArchiveMetadataFieldsConfig.pm

Under "full_text_status" add this config:

        { name => "cover_col", type=> "set", 
                        options => [ "red", "green", "blue", "other" ] },

That tells eprints that there's a field of type "set" with those options. Now we need to add the human readable names and descriptions. These go in the archive-specific phrase file. There's also one for all of eprints which contains all the phrases used in the interface, except for those which are archive specific.

Add the human-readable text for the new field

Edit: /opt/eprints2/archives/myid/cfg/phrases-en.xml

Add the following phrases:

    <ep:phrase ref="eprint_fieldname_cover_col">Colour of Book Cover</ep:phrase>
    <ep:phrase ref="eprint_fieldhelp_cover_col">Select the approximate colour of the cover of the book.</ep:phrase>
    <ep:phrase ref="eprint_fieldopt_cover_col_red">Mostly Red</ep:phrase>
    <ep:phrase ref="eprint_fieldopt_cover_col_green">Mostly Green</ep:phrase>
    <ep:phrase ref="eprint_fieldopt_cover_col_blue">Mostly Blue</ep:phrase>
    <ep:phrase ref="eprint_fieldopt_cover_col_other">Other</ep:phrase>

Keywords like "cover_col" and "blue" are used in the database and not seen by the users. The phrases are the cosmetics of your new metadata field.
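
It is easy to forget a phrase for one of the options. The sketch below checks a phrase file for the name, help and option phrases of a field, following the ref naming pattern shown in the phrases above. The function name check_field_phrases is made up, and the check is a plain text search, not a real XML parse.

```shell
# Hypothetical helper: check that a phrase file mentions the name, help
# and option phrases for a field.
# Arguments: phrase file, field name, then the option keywords.
check_field_phrases() {
    cfp_file=$1
    cfp_field=$2
    shift 2
    cfp_refs="eprint_fieldname_$cfp_field eprint_fieldhelp_$cfp_field"
    for cfp_opt in "$@"; do
        cfp_refs="$cfp_refs eprint_fieldopt_${cfp_field}_$cfp_opt"
    done
    cfp_status=0
    # Word splitting is intended: refs never contain spaces.
    for cfp_ref in $cfp_refs; do
        if ! grep -qF "ref=\"$cfp_ref\"" "$cfp_file"; then
            echo "missing phrase: $cfp_ref"
            cfp_status=1
        fi
    done
    return "$cfp_status"
}
```

For the field in this workshop you would run "check_field_phrases /opt/eprints2/archives/myid/cfg/phrases-en.xml cover_col red green blue other". It prints nothing if every phrase is present.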

Add the field to some eprint-record-types

Now you need to add the metadata field to one or more types of record.

Edit: /opt/eprints2/archives/myid/cfg/metadata-types.xml

Make the cover colour optional for records of type "book_section" but required for type "book"... First find the book_section eprint type. It starts with: <type name="book_section">

Add this anywhere in the "type". Where you add it controls where, and on which page, the field appears in the submission form:

      <field name="cover_col" />

Now find type "book" and add:

      <field name="cover_col" required="yes" />

Note: re-creating the database tables and restarting the indexer is only needed because you made a significant change to ArchiveMetadataFieldsConfig.pm. Normally restarting Apache is enough to make a config change take effect.

Erase and rebuild the database

OK, that's the configuration done. Now erase your website and database, and rebuild it:

/opt/eprints2/bin/erase_archive myid --force

MySQL root password is blank (just hit return).

/opt/eprints2/bin/create_tables myid
/opt/eprints2/bin/import_subjects myid --force
/opt/eprints2/bin/generate_static myid

There is no need to rerun generate_apacheconf as you've not changed anything to do with the actual serving of the website.

Make the indexer and Apache load the new config

Now stop and start the webserver and indexer so they get the new configuration.

/opt/eprints2/bin/indexer stop
/opt/eprints2/bin/indexer start
killall httpd
/usr/sbin/httpd -f /opt/httpd8080/conf/httpd.conf

And create an admin user (again). erase_archive destroyed all the user data too!

/opt/eprints2/bin/create_user myid adminuser fred@example.com admin secret
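
If you end up repeating the erase-and-rebuild cycle, the commands above can be collected into one function. This is a sketch for the workshop setup only. The names run, rebuild_archive, DRY_RUN and EPRINTS_BIN are made up here. By default it only prints what it would do. erase_archive will still ask for the MySQL root password when run for real, and create_user is left out so that no password ends up in a script.

```shell
# Hypothetical wrapper around the commands in this part; not part of EPrints.
# With DRY_RUN=1 (the default here) each command is printed, not run.
DRY_RUN=${DRY_RUN:-1}
EPRINTS_BIN=${EPRINTS_BIN:-/opt/eprints2/bin}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Argument: the archive ID, e.g. myid.
rebuild_archive() {
    ra_id=$1
    run "$EPRINTS_BIN/erase_archive" "$ra_id" --force || return 1
    run "$EPRINTS_BIN/create_tables" "$ra_id" || return 1
    run "$EPRINTS_BIN/import_subjects" "$ra_id" --force || return 1
    run "$EPRINTS_BIN/generate_static" "$ra_id" || return 1
    run "$EPRINTS_BIN/indexer" stop || return 1
    run "$EPRINTS_BIN/indexer" start || return 1
    run killall httpd || true   # fine if Apache was not running
    run /usr/sbin/httpd -f /opt/httpd8080/conf/httpd.conf || return 1
}
```

Call "rebuild_archive myid" to preview the steps, and set DRY_RUN=0 first once you are happy with them. Stopping at the first failure is a design choice: it avoids starting the webserver against a half-built database.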

See your new field

On the website:

And if that all works first time give yourself a pat on the back!