Installing ghini.server follows the standard installation procedure for any Python 3 program: first make sure you can create virtual environments, then create and activate the virtual environment where you will install the software, and finally install the software, either from the global Python Package Index or from a custom source.
Please do not even think of asking me how to install and run ghini.server on Windows. Even if it is definitely possible (as said above, ghini.server is a standard Python package), helping you use Windows for serving data over the internet is just as definitely out of my sphere of interest; it’s simply “not done”, and not only in my opinion. You can use ghini from a Windows client, even using the Microsoft Edge browser, but here I will only explain how to install ghini.server on a unix server, specifically GNU/Linux.
Get yourself a recent Python version. Now that Debian has released its 10th version with Python 3.7, there’s no excuse for staying on anything older than that. Anyhow, the software works fine with Python 3.5 (the one which came with Debian 9) and is continuously tested against Python 3.5, 3.6 and 3.7.
The other non-automatic step is installing pip. Look somewhere else if you think you need more details. If you’re on Debian, pip is contained in the python3-venv package and gets automatically installed when you create your virtual environment. Keep in mind that the command to create a virtual environment is the not-too-mnemonic python3 -m venv:
sudo apt-get install python3-venv
python3 -m venv ~/.virtualenvs/ghini/
. ~/.virtualenvs/ghini/bin/activate
pip install --upgrade pip
At the time of writing, ghini.server isn’t yet published as a PyPI package, so even if we describe the PyPI method, it might not work yet.
Decide whether to download the packaged software from the Python Package Index, or to use the github version.
Using PyPI is as easy as:
pip install ghini.server
If using github, you will need git; then you must decide whether to use your own fork or the official repository, and choose the branch you want to check out. In short and in practice:
mkdir -p ~/Local/github/Ghini
cd ~/Local/github/Ghini
( git clone git@github.com:Ghini/server.git || git clone https://github.com/Ghini/server.git )
cd server
git checkout ghini-3.2
pip install -r requirements.txt
In either case, pip install takes care of all dependencies, and you need take no further action.
As is standard practice with any Django program, after installing the software you need to set up the database, then migrate it, create the users you need, and import your data; then you can run the site and navigate to it.
The first step is to set up the database. You do this by tweaking the ghini/settings.py file, or by copying it to a custom-named settings file respecting the format ghini/settings_<name>.py. You would do the first if you only need one site, and the second if you plan to run multiple instances of ghini.server.
With the virtual environment active, run the command ./manage.py migrate. It will load your default settings; to use a custom settings file instead, append the --settings option to the command:
./manage.py migrate --settings=ghini.settings_<name>
To create users, the easiest would be to enter the django shell, then to do something along these lines:
from django.contrib.auth.models import User
me, _ = User.objects.get_or_create(username='mario')
me.set_password('mario')  # or maybe something safer
me.save()
The standard way to run your site in development mode is to just run the ./manage.py runserver command. Specify on which port to serve your site, then navigate locally to it.
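As a concrete sketch (the port number is our choice, any free port works):

```shell
# serve the site on port 8000 (the port choice is an assumption)
./manage.py runserver 8000
# then point your browser at http://127.0.0.1:8000/
```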
All the above would work on Windows as well. The next steps are about going into production, for which we advise the use of nginx combined with uwsgi, and a unix system.
publishing the site
The above method lets you navigate your data using only the Django server. This is not how you would run a production site, where requests first reach your web server, which takes care of redirecting each part of the request to the appropriate code producing that part of the composite response.
ghini.server is based on uwsgi. We advise an external nginx server, so that all incoming web requests reach that first. Configure nginx so that it makes sure the clients are talking the secure https protocol.
Most requests will cause a further cascade of simpler requests. The rules in our nginx configuration establish whether nginx will satisfy the request itself by serving static data, or whether it will redirect the request to a local uwsgi socket, behind which sits the Django application. Run uwsgi in emperor mode and let it inspect the uwsgi.d directory in the ghini.server main directory.
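As a sketch only (the server name, file paths and socket location are our assumptions, not taken from the project, and the ssl certificate directives are omitted), an nginx site definition along these lines serves static data itself and forwards everything else to the uwsgi socket:

```nginx
server {
    listen 443 ssl;
    server_name ghini.example.com;            # assumption: your own domain

    location /static/ {
        alias /home/ghini/server/static/;     # assumption: your collected static files
    }

    location / {
        include uwsgi_params;
        uwsgi_pass unix:/run/ghini/ghini.sock;  # assumption: the socket named in your ini file
    }
}
```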
Produce or update your ini files by running the provided script. It generates them corresponding to the content of the /etc/nginx/sites-enabled directory, for all site definitions looking like ghini.server instances.
We have a main api for interacting with the database.
Each object has its URL, which really identifies the object (e.g.: plant #1 for accession 101 in year 2001):
Removing the object’s trailing identifier from the URL gives the class URL (e.g.: the plants collection):
The trailing slash is part of the URL, but the server will add it if it’s missing.
We organized the objects in three sections: taxonomy, collection and garden. Some day another section might come, or we may reorganize into fewer sections; we will see. As of now, we have these collections:
/taxonomy/rank/
/taxonomy/taxon/
/collection/accession/
/collection/contact/
/collection/accession/<code>/verification/
/garden/accession/<code>/plant/
/garden/accession/<code>/plant/<code>/propagation/
/garden/location/
Verifications and Plants only make sense in combination with an accession, so their collections are behind an accession code. Same for Propagations, which only make sense in relation with the mother plant.
Append a primary key to a collection URL, and you get the URL for an individual within the collection.
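For illustration (the helper functions are ours, and the accession code 2001.0101 is an invented example in the spirit of the plant example above), the URL conventions can be sketched as:

```python
def collection_url(*parts):
    """Build a collection URL; the trailing slash is part of the URL."""
    return '/' + '/'.join(str(p) for p in parts) + '/'

def detail_url(*parts, pk):
    """Append a primary key to a collection URL to address one individual."""
    return collection_url(*parts) + str(pk) + '/'

# the plants collection behind an accession code ('2001.0101' is invented)
print(collection_url('garden', 'accession', '2001.0101', 'plant'))
# -> /garden/accession/2001.0101/plant/

# plant #1 within that accession
print(detail_url('garden', 'accession', '2001.0101', 'plant', pk=1))
# -> /garden/accession/2001.0101/plant/1/
```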
As far as their URLs are concerned, objects like contacts have a primary key which is a sequential number, with no semantics.
Accessions have their own accession code, Plants have a sequential plant code within the Accession they belong to, Verifications also have a unique sequential number within the Accession they describe. Propagations have a sequential number within their mother Plant.
If we generalize the database to model more than one garden, we will need to associate accessions to gardens, we will probably identify gardens with a stub, and will prepend accession urls with a garden stub code. As of now, we only deal with a single garden.
GET and her sisters
Collection URLs implement the GET and POST verbs, respectively for getting the whole collection (or a selection thereof), and for adding an individual object to the collection. These URLs get a -list suffix in their Django name.
Individual URLs implement the GET, PUT and DELETE verbs, with their obvious meanings, applying to the specific individual only. These URLs get a -detail suffix in their Django name.
Collections also have an URL for the empty html form, to be populated by the user and posted to the server. The corresponding Django names have a suffix of their own.
Individual objects have more entry points, respectively for:
- The populated html form (django suffix form).
- A json data dictionary for the infobox (django suffix infobox).
- A dictionary with several representations for the same object (django suffix markup).
- A json data dictionary with depending objects (django suffix depending); the definition of the concept depends on the object: a Location considers the plants located there as its depending objects, a Taxon its subtaxa and the accessions verified to it. The result has the same shape as the dictionary returned by a search.
- A rendered html page with object pictures.
filter/ and get-filter-tokens/ are the main query api entry points. Both expect a q parameter, which they interpret according to several search strategies. Search strategies are described in some detail in the user manual.
The result of a get-filter-tokens/ request is a dictionary, where the keys are the names of the collections in the result, and the values are tokens. You get as many tokens as there are non-empty collections matching your query.
The next step on the client side is to enter a loop to cash your tokens.
Each invocation of the cash-token/<token>/ entry point returns you a dictionary with three keys:
- chunk holds the list of items.
- expect specifies the length of the expected complete set. One possible use is to update a progress bar.
- done tells you whether this was the last chunk.
Attempting to cash a token which was already paid in full will give you the empty result. The same will happen if you attempt to cash an invalid token.
If you are somewhat too quick in cashing a new token, the expect value could still be a large hard-coded value. The correct value is computed in a separate thread, so the server can provide all tokens as soon as possible.
Tokens will expire after some delay in cashing them. This prevents queries from staying active in the system when they are no longer relevant.
For queries where you expect a small result set (less than ~70 elements), you may prefer the filter/ entry point. filter short-circuits this process, providing the concrete result at once, in a dictionary having the same external structure as the get-filter-tokens result: one key per non-empty collection, with the list of matching objects as its value.
One more entry point in this group is count/; it accepts the same q parameter as get-filter-tokens, and returns a dictionary with the same external structure. The values in this case are the count() of the matching querysets, plus a grand total under the key __total__. You can use this to decide whether to use filter or the chunked approach.
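A client could use count/ like this; the counts dictionary is invented, and the 70-element threshold comes from the rule of thumb above:

```python
SMALL_RESULT = 70  # rough threshold suggested above

def choose_entry_point(counts):
    """Pick filter/ for small result sets, the chunked token api otherwise."""
    if counts['__total__'] < SMALL_RESULT:
        return 'filter/'
    return 'get-filter-tokens/'

# invented counts, shaped like a count/ response
counts = {'taxon': 12, 'accession': 30, '__total__': 42}
print(choose_entry_point(counts))  # -> filter/
```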
On the server side, executing a search corresponds to constructing one or more querysets. Each element in the queryset is subsequently converted into a dictionary, with the structure:
- inline: the string shown in the result; it may contain html tags.
- twolines: three elements to be shown in different parts of the client.
- infobox_url: the url to get the corresponding infobox.
The twolines entries are meant to be included in the results box. The infobox_url provides quick access to the URL where we will get the infobox data, but you can just take the trailing infobox/ part and replace it with whatever other valid suffix. At the moment of writing, the URLs implemented are form/, markup/, depending/.
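Swapping the suffix is a plain string operation; the infobox URL below is an invented example:

```python
def swap_suffix(infobox_url, suffix):
    """Replace the trailing 'infobox/' part of the URL with another valid suffix."""
    assert infobox_url.endswith('infobox/')
    return infobox_url[:-len('infobox/')] + suffix

url = '/garden/location/4/infobox/'   # invented example URL
print(swap_suffix(url, 'markup/'))    # -> /garden/location/4/markup/
```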
importing from ghini.desktop
Please consider this work in progress, try out the instructions, and be prepared to ask for help or to open an issue if the present instructions do not work.
First of all: taxasoft-ghini is not complete, not yet. The current goal is to have it do something useful, and to be visible on-line, it does not (yet) substitute ghini.desktop. Not at all. Expect things to be exciting, but do not expect things to work out of the box.
Got this? Good, now let’s see how to copy your ghini.desktop collection into taxasoft-ghini!
- export your (complete) data to csv.
- open ghini-1.0 again,
- create a new sqlite3 connection,
- let ghini create the database.
- import the data, this will again initialize the database.
the result of the above steps is an expendable sqlite3 database: this way, whatever we do to it has zero impact on your original data.
remove all taxonomic information that is not used. we do this straight on the expendable database:
sqlite3 ghini.db
delete from genus where id not in (select genus_id from species);
delete from family where id not in (select family_id from genus);
delete from genus_synonym where genus_id not in (select id from genus);
delete from genus_synonym where synonym_id not in (select id from genus);
consider removing history too, it’s not imported anyway:
delete from history;
export your (reduced) data to csv.
this will take a fraction of the time of the previous export.
now to taxasoft-ghini
- enter the directory of your check-out;
- activate the virtual environment;
- move any previous database out of the way;
- create a new database and initialize it:
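A sketch of these steps in the shell, assuming the checkout path and virtual environment from the installation section, and Django's default db.sqlite3 database file (all three are assumptions):

```shell
cd ~/Local/github/Ghini/server         # your check-out directory
. ~/.virtualenvs/ghini/bin/activate    # the virtual environment created earlier
mv db.sqlite3 db.sqlite3.bak           # move any previous database out of the way (filename is an assumption)
./manage.py migrate                    # create a new database and initialize it
```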
consider whether you also want the intermediate taxa, between ranks familia and genus. since importing this information takes rather long, it is not included in the ‘migration’ command. if you want this data, you must request the import explicitly, with:
./manage.py import_genera_derivation
have something else to do in the meanwhile: this will take no less than one full hour. on my laptop, writing to a sqlite3 database, it lasts 2 hours.
if you’re in a hurry, ask for a partial genus import, limiting to the genera in your trimmed database:
./manage.py import_genera_derivation --filter-genera <your genus.txt file>
you can repeat the command without filtering, whenever you know you’re not going to use the database for a couple of hours.
run the command:
./manage.py import_desktop <location of second export>
this will output as many ‘+’ as the objects it inserted, and as many ‘.’ as the objects it already found in place. for species, a ‘v’ is added if the related species is at lower rank.
watch the genus list in particular: it should be just a sequence of dots. if it is not, it’s because you’re importing genera that were not created during the previous steps. that’s clearly not good, and you should review your data.
the opposite goes for the species list: remember that with ghini reloaded, fictive species are no longer needed. a dot tells you that the corresponding taxon was found in the database, at some higher rank.
it is normal that importing accessions takes longer: for each object we are creating not only the accession but also the verification object that links the accession to the corresponding taxon.
create your superuser:
./manage.py createsuperuser
run your server:
./manage.py runserver
I’m sure there will be errors. please open issues about them, and if you have a solution, propose it.