Automating Tasks and Procedures
From DoKSwiki
This page shows some "code building blocks" and techniques to help you understand the scripts that are included in your DoKS installation and to help you start writing your own scripts.
Importing Data
You can import data from a tab separated (or otherwise separated) text file, e.g. exported from a spreadsheet program like MS Excel or OpenOffice Calc.
A good example is importing existing document and author metadata: at KHK (http://www.khk.be) we import all students at the beginning of each year. Our import script creates an ETD record, an Author record and a User for each student, and sets up the proper links and access rights.
Below is a description of the different steps involved.
In short: we read student information records line by line from a text file and for each student record we create a user, Author and ETD.
Access and read a text file
Create a Document record with an appropriate description (like "import") in an appropriate folder (like "/Home/Administration/Import") and attach the file to it, just as you would create an ETD with an attached PDF.
In a script you can then access the folder:
importPath = "/Home/Administration/Import";
importFolder = folderFacade.getFolderDTOByPath(importPath);
Get the list of records in the folder:
records = folderFacade.getRecordLinks(importFolder.id);
Find the appropriate record:
recordDescription = "import";
importRecordDto = null;
recordIter = records.iterator();
while (recordIter.hasNext()) {
link = recordIter.next();
if (recordDescription.equals(link.getDescription())) {
importRecordDto = recordFacade.getRecordDTO(link.id);
}
}
if (importRecordDto == null) {
throw new Exception("could not find the record with description: " + recordDescription);
}
Open the text file:
fileName = "import2005.txt";
encoding = doks.config.Config.getConfig().getProperty("importexport.encoding"); //get the preferred encoding from doks.properties
lineNumberReader = null;
try {
file = importRecordDto.getDocuments().getFileByName(fileName).getFile();
reader = new InputStreamReader(new FileInputStream(file), encoding);
lineNumberReader = new LineNumberReader(reader);
} catch (e) {
throw new Exception("could not open file with name: " + fileName);
}
Read and process the contents line by line:
while ((line = lineNumberReader.readLine()) != null) {
//only read lines that are not empty and do not start with '#' (indicating a comment line)
if (!line.startsWith("#") && (line.trim().length() > 0)) {
//your processing code for each line here
}
}
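If you want to try the read loop outside DoKS first, the sketch below does the same filtering in plain Java. It is only an illustration: the class name ImportLines and the method dataLines are made up and are not part of DoKS, and the method reads from any Reader so you can feed it a string instead of an attached file.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for experimenting outside DoKS; not part of the DoKS API.
class ImportLines {

    // Returns every line that is not empty and does not start with '#'.
    static List<String> dataLines(Reader source) {
        List<String> result = new ArrayList<>();
        try (LineNumberReader reader = new LineNumberReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (!line.startsWith("#") && line.trim().length() > 0) {
                    result.add(line);
                }
            }
        } catch (IOException e) {
            // rethrow unchecked so callers do not need a try/catch for this small demo
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String sample = "#studentnr\tname\n\n00001\tTest1\n00002\tTest2\n";
        for (String line : dataLines(new StringReader(sample))) {
            System.out.println(line); // only the two student lines are printed
        }
    }
}
```

The try-with-resources block closes the reader when the loop ends, which the script version above does not do explicitly.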
Process a tab separated line of text
All the code here goes into the while loop defined above, except when indicated.
Supposing that each line of text has a fixed number of tab separated fields (the columns in your original spreadsheet table), you can easily parse the line:
columns = line.split("\t", -1);
Replace "\t" (the tab character) if you want to use another separator like ";" or "|".
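Two details of String.split are easy to miss. The limit -1 keeps empty fields at the end of a line (without it they are silently dropped), and the separator is a regular expression, so a separator like "|" has to be quoted. The small Java sketch below shows both; SplitDemo and fields are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; not part of DoKS.
class SplitDemo {

    // Splits a line on a literal separator and keeps empty fields at the end of the line.
    static String[] fields(String line, String separator) {
        return line.split(Pattern.quote(separator), -1);
    }

    public static void main(String[] args) {
        String line = "513785\tDoe\tJane\t\t";               // the last two fields are empty
        System.out.println(line.split("\t").length);         // prints 3: trailing empty fields dropped
        System.out.println(fields(line, "\t").length);       // prints 5: trailing empty fields kept
        System.out.println(fields("a|b|c", "|").length);     // prints 3: '|' is taken literally
    }
}
```

Keeping the trailing fields matters when the last columns of your spreadsheet are often empty: without them, reading a late column would fail with an array index error.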
And then assign the content of the fields to variables:
studentNr = columns[0].trim();
name = columns[1].trim();
firstName = columns[2].trim();
email = columns[3].trim();
//...etc...
Variables, numbering and sequence of the columns will depend on the precise format of your import file.
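Because the column layout is specific to your file, it can help to check the number of fields before reading them, so that a short line gives a readable error instead of an array index error. The following is a minimal Java sketch of that idea; the class StudentLine, its four fields and their order are assumptions made for this illustration, not something DoKS provides.

```java
// Hypothetical value class; not part of DoKS. Holds one parsed line of the import file.
class StudentLine {
    final String studentNr;
    final String name;
    final String firstName;
    final String email;

    private StudentLine(String studentNr, String name, String firstName, String email) {
        this.studentNr = studentNr;
        this.name = name;
        this.firstName = firstName;
        this.email = email;
    }

    // Parses one tab separated line; the column order is an assumption about your file.
    static StudentLine parse(String line) {
        String[] columns = line.split("\t", -1);
        if (columns.length < 4) {
            throw new IllegalArgumentException(
                "expected at least 4 fields but found " + columns.length);
        }
        return new StudentLine(columns[0].trim(), columns[1].trim(),
                               columns[2].trim(), columns[3].trim());
    }
}
```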
The next step is to create the appropriate records: in our case Author, User and ETD records.
Note: multiline records
If some fields contain carriage returns or newlines (e.g. a field holding an abstract), one row of your table will not equal one line of text. In that case the parsing code gets a little more complicated.
First place this initialization code outside the while loop:
columns = new String[0];
int FIELDS = 17; //the number of fields (columns) we expect to find.
This is the multiline parsing code:
newcolumns = line.split("\t", -1);
//some fields have \r\n, so the records are multiline.
if (newcolumns.length < FIELDS) {
if (columns.length > 0) {
//make 1 shorter because we merge the last field of columns with the first of newcolumns
temp = new String[columns.length-1 + newcolumns.length];
}else{
temp = new String[columns.length + newcolumns.length];
}
System.arraycopy(columns, 0, temp, 0, columns.length);
int from = columns.length;
if (columns.length > 0) {
//merge the last field of columns with the first of newcolumns
newcolumns[0] = columns[columns.length - 1] + "\r\n" + newcolumns[0];
from--;
}
System.arraycopy(newcolumns, 0, temp, from, newcolumns.length);
columns = temp;
if (columns.length < FIELDS) continue; //here we continue the while loop, reading lines as long as there are not enough fields.
}else{
columns = newcolumns;
}
Then reset the columns array at the end of the processing while loop, so that the next record starts empty:
columns = new String[0];
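The same merging idea can be written with a list instead of array copies, which may be easier to follow. The Java sketch below is a variant, not the code DoKS ships: MultilineRecords and assemble are hypothetical names, and it assumes the comment and empty lines have already been filtered out. A line that arrives while a record is still incomplete is treated as the continuation of that record's last field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of DoKS.
class MultilineRecords {

    // Groups physical lines into records of at least fieldCount tab separated fields.
    static List<String[]> assemble(List<String> lines, int fieldCount) {
        List<String[]> records = new ArrayList<>();
        List<String> pending = new ArrayList<>(); // fields collected so far for the current record
        for (String line : lines) {
            String[] parts = line.split("\t", -1);
            if (pending.isEmpty()) {
                pending.addAll(Arrays.asList(parts));
            } else {
                // merge the first field of this line into the last field collected so far
                int last = pending.size() - 1;
                pending.set(last, pending.get(last) + "\r\n" + parts[0]);
                pending.addAll(Arrays.asList(parts).subList(1, parts.length));
            }
            if (pending.size() >= fieldCount) {
                records.add(pending.toArray(new String[0]));
                pending.clear(); // start the next record with an empty list
            }
        }
        return records;
    }
}
```

Like the array version, this only works when the line breaks occur inside a field; a file that ends in the middle of a record leaves that record out of the result.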
Create a User
try{
userDto = new UserDTO(studentNr, "StudentInfo");
userDto.setFullName(firstName + ' ' + name);
userDto.setEmail(email);
userDto.setPassword("default_password");
userDto.setLanguage("nl");
userFacade.createUser(userDto);
userFacade.setLoginConfiguration(userDto.id, "");
} catch(DuplicateNameException e){ //the user already exists
userDto = userFacade.getUserDTO(userFacade.getUserByName(studentNr));
}
For more information about userFacade.setLoginConfiguration see LoginConfiguration
Create an Author record
To ensure unique Author records for people with identical names, the Author structure should have a uuid field to hold a unique identifier (like the student number, if that is unique over the years).
authorUID = "" + studentNr;
authors = recordFacade.findRecords("Author", "record.uuid='" + authorUID + "'");
if (authors.isEmpty()) { //the author does not exist yet, create it
author = recordFacade.createRecordDTO("Author");
author.setDescription(name + ", " + firstName);
author.setUuid(authorUID);
}else{ //author already exists, get it
authorId = authors.iterator().next().id;
author = recordFacade.getRecordDTO(authorId);
}
This code can be enhanced e.g. by putting each Author into an alphabetical folder according to the first character of the name like "/Home/Authors/A", "/Home/Authors/B", etc...
At the beginning of your script, before reading the text file, get the "/Home/Authors" folder:
authorFolder = folderFacade.getFolderDTOByPath("/Home/Authors");
Then put the code below in the if clause of the Author creation code above:
alfaFolderName = name.substring(0, 1).toUpperCase();
alfaFolder = folderFacade.getSubfolderDTOByName(authorFolder.id, alfaFolderName);
recordFacade.createRecord(alfaFolder.id, author);
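Taking the first character directly fails on an empty name and gives a letter such as "Ö" for which you may not have a folder. The Java sketch below shows one possible way to guard against that; LetterFolder, folderNameFor and the idea of a fallback folder are assumptions made for this illustration and are not part of DoKS.

```java
import java.text.Normalizer;

// Hypothetical helper; not part of DoKS.
class LetterFolder {

    // Returns "A".."Z" for the first character of the name, or the fallback if that is not possible.
    static String folderNameFor(String name, String fallback) {
        String trimmed = name.trim();
        if (trimmed.isEmpty()) {
            return fallback;
        }
        // decompose accented characters so that the base letter comes first
        String base = Normalizer.normalize(trimmed.substring(0, 1), Normalizer.Form.NFD);
        char first = Character.toUpperCase(base.charAt(0));
        return (first >= 'A' && first <= 'Z') ? String.valueOf(first) : fallback;
    }
}
```

You would then pass the result to the subfolder lookup instead of the raw first character, and create the fallback folder by hand next to the letter folders.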
Create an ETD record
First create the DTO (Data Transfer Object) for the ETD:
etd = recordFacade.createRecordDTO("MY_ETD"); //MY_ETD should be the structure name for your ETD record type as defined in structures.xml
etd.publisher= "My School";
etd.grantor= "<a href='http://www.myschool.com'>My School</a>";
etd.type= "Electronic Thesis or Dissertation";
etd.language.add("dut");
etd.rights= "All rights reserved";
etd.status= "INIT";
etd.description= "Your title here";
etd.publicationDate= Integer.parseInt(year);
//... fill in all the information that you imported
At the beginning of your script, before reading the text file, get (or create) the "/Home/Years/2006" folder:
String year = "2006";
folderPath = "/Home/Years/";
try{
yearFolder = folderFacade.getFolderDTOByPath(folderPath + year);
} catch(e){ //folder does not exist, create it
folderFacade.createFolder(folderPath + year, null, false);
yearFolder = folderFacade.getFolderDTOByPath(folderPath + year);
yearFolder.addStructureName("ETD");
folderFacade.updateFolder(yearFolder);
}
Then create the ETD record in the year folder; this creates the actual record in the database:
recordFacade.createRecord(yearFolder.id, etd);
Set ETD access rights
Set the permissions for the ETD and its collections of files (fulltext and appendices):
etd.setPermission(Constants.EVERYONE, Constants.READ, true);
etd.setPermission(Constants.GROUP, Constants.READ, true);
etd.setPermission(Constants.GROUP, Constants.WRITE, true);
etd.setPermission(Constants.OWNER, Constants.READ, true);
etd.setPermission(Constants.OWNER, Constants.WRITE, true);
etd.setUserGroup(groupDto.toLink());
accessControlFacade.updatePermissions(etd);
fulltext= etd.fullText;
fulltext.setPermission(Constants.EVERYONE, Constants.READ, true);
fulltext.setPermission(Constants.GROUP, Constants.READ, true);
fulltext.setPermission(Constants.GROUP, Constants.WRITE, true);
fulltext.setPermission(Constants.OWNER, Constants.READ, true);
fulltext.setPermission(Constants.OWNER, Constants.WRITE, true);
fulltext.setUserGroup(groupDto.toLink());
accessControlFacade.updatePermissions(fulltext);
appendices= etd.appendices;
appendices.setPermission(Constants.EVERYONE, Constants.READ, true);
appendices.setPermission(Constants.GROUP, Constants.READ, true);
appendices.setPermission(Constants.GROUP, Constants.WRITE, true);
appendices.setPermission(Constants.OWNER, Constants.READ, true);
appendices.setPermission(Constants.OWNER, Constants.WRITE, true);
appendices.setUserGroup(groupDto.toLink());
accessControlFacade.updatePermissions(appendices);
Create a link between Author and ETD
Add the Author to the ETD's authors collection and add the ETD to the Author's work collection:
etd.authors.add(author.toLink());
recordFacade.updateRecord(etd);
author.work.add(etd.toLink());
recordFacade.updateRecord(author);
Complete example scripts
You should now be able to easily understand the example scripts and create your own customized import script.
Example 1
This first example does not create users, it just imports ETDs and Authors. It can handle linebreaks in records in the import file.
#This is an example import file
#Titel SubTitel Jaartal Voornaam Auteurs.Naam PagGenummerd PagNietGenummerd Abstract Opmerkingen Omschrijving Scholen.Naam Afkorting plaats discipline graad volumes awards
This is the title of the first record and the subtitle 2004 Jane Doe 33 3 This is the abstract without line breaks. Boek My School MS Geel some science Bachelor in some science 1
This is the title of the second record and the subtitle 1993 Pete Petersen 73 13 This is the abstract with line breaks.
This is the second line of the abstract with linebreaks of the second record Boek My School MS Geel some science Bachelor in some science 1
//This script creates an ETD record and an Author record for each student.
//An existing Author record is reused when its uuid (institution code/year/last name/first name) matches; otherwise a new one is created.
//The script expects the following:
// * a Document record with description "import"
// this record must be located in the folder with path "/Home/Administration"
// and must contain a file named "import.txt"
// * import.txt must have one line for each ETD containing following values (tab separated)
// Titel
// SubTitel
// Jaartal
// Voornaam
// Naam
// PagGenummerd
// PagNietGenummerd
// Abstract
// Opmerkingen
// Medium
// SchoolNaam
// SchoolAfkorting
// SchoolGemeente
// discipline
// graad
// volumes
// awards
//
args = params.get(Constants.ARGS);
//HELPER METHODS
FolderDTO getFolder(String path){
FolderDTO folder = null;
try{
folder = folderFacade.getFolderDTOByPath(path);
} catch(e){
folderFacade.createFolder(path, null, false);
folder = folderFacade.getFolderDTOByPath(path);
folder.addStructureName("ETD");
folderFacade.updateFolder(folder);
}
return folder;
}
String getDegreeLevel(String degree){
if (degree.startsWith("Graduaat")) {
return "Graduaat";
}else{
return "Master";
}
}
//MAIN CODE
encoding = doks.config.Config.getConfig().getProperty("importexport.encoding");
// the folder where we will look for the student information
administrationPath = "/Home/Administration";
// the folder with author records
authorPath = "/Home/Authors";
// the folder with all the theses of this year
yearPath = "/Home/Years/";
// the name of the record with the student information (contains the year)
recordDescription = "import";
// the file to look for
fileName = "import.txt";
// look up the folders
authorFolderId = folderFacade.getFolderDTOByPath(authorPath).id;
adminFolder = folderFacade.getFolderDTOByPath(administrationPath);
recordIter = folderFacade.getRecordLinks(adminFolder.id).iterator();
adminRecordDto = null;
while (recordIter.hasNext()) {
link = recordIter.next();
if (recordDescription.equals(link.getDescription())) {
adminRecordDto = recordFacade.getRecordDTO(link.id);
}
}
if (adminRecordDto == null) {
throw new Exception("could not find the record with description: " +
recordDescription);
}
etdCount = 0;
studentCount = 0;
existingAuthors = 0;
lineNumberReader = null;
// open the file
try {
studentFile = adminRecordDto.getDocuments().getFileByName(fileName).getFile();
reader = new InputStreamReader(new FileInputStream(studentFile), encoding);
lineNumberReader = new LineNumberReader(reader);
} catch (e) {
throw new Exception("could not open file with name: " + fileName);
}
// process each line of the file that doesn't start with #
// each line contains: #Titel SubTitel Jaartal Voornaam Auteurs.Naam PagGenummerd PagNietGenummerd Abstract Opmerkingen Omschrijving Gemeente Scholen.Naam
columns = new String[0];
int FIELDS = 17;
while ((line = lineNumberReader.readLine()) != null) {
if (!line.startsWith("#") && (line.trim().length() > 0)) {
logger.debug("parsing line: " + line.substring(0, ((line.length()<20)?line.length():20)));
try {
newcolumns = line.split("\t", -1);
//some fields have \r\n, so the records are multiline.
if (newcolumns.length < FIELDS) {
logger.debug("columns " + newcolumns.length);
if (columns.length > 0) {
//make 1 shorter because we merge the last field of columns with the first of newcolumns
temp = new String[columns.length-1 + newcolumns.length];
}else{
temp = new String[columns.length + newcolumns.length];
}
System.arraycopy(columns, 0, temp, 0, columns.length);
int from = columns.length;
if (columns.length > 0) {
//merge the last field of columns with the first of newcolumns
logger.debug(newcolumns[0]);
newcolumns[0] = columns[columns.length - 1] + "\r\n" + newcolumns[0];
from--;
logger.debug(newcolumns[0]);
}
System.arraycopy(newcolumns, 0, temp, from, newcolumns.length);
columns = temp;
logger.debug(columns.length);
if (columns.length < FIELDS) continue;
}else{
columns = newcolumns;
}
title = columns[0].trim();
subtitle = columns[1].trim();
year = columns[2].trim();
firstName = columns[3].trim();
lastName = columns[4].trim();
numberedPages = columns[5].trim();
unnumberedPages = columns[6].trim();
abstract = columns[7].trim();
remarks = columns[8].trim();
materials = columns[9].trim();
institutionName = columns[10].trim();
institutionCode = columns[11].trim();
institutionPlace = columns[12].trim();
discipline = columns[13].trim();
degree = columns[14].trim();
volumes = columns[15].trim();
award = columns[16].trim();
//create the author record
fullName = lastName + ", " + firstName;
authorUID = institutionCode.toLowerCase() +"/"+ year +"/"+ lastName.toLowerCase() +"/"+ firstName.toLowerCase();
author = null;
authors = recordFacade.findRecords("Author", "record.uuid='" + authorUID + "'");
if (authors.isEmpty()) { //the author does not exist yet
logger.debug("creating author");
author = recordFacade.createRecordDTO("Author");
author.setDescription(fullName);
author.setUuid(authorUID);
letterFolder = folderFacade.getSubfolderDTOByName(authorFolderId, lastName.substring(0, 1).toUpperCase());
recordFacade.createRecord(letterFolder.id, author);
logger.debug("author created: " + fullName);
} else { //author already exists
logger.warn("author exists!: " + fullName + " " + authorUID);
authorId = authors.iterator().next().id;
author = recordFacade.getRecordDTO(authorId);
existingAuthors++;
}
logger.debug("creating etd");
etd = recordFacade.createRecordDTO("BIB_ETD");
etd.publisher= "VVBAD";
etd.grantor= institutionName + ", " + institutionPlace;
etd.type= "eindwerk";
etd.language.add("dut");
etd.rights= "All rights reserved";
etd.status= "PUBLISHED";
etd.description= title;
etd.publicationDate= Integer.parseInt(year);
etd.authors.add(author.toLink());
etd.degreeName= degree;
etd.department= discipline;
etd.degreeLevel= getDegreeLevel(degree);
etd.discipline = discipline;
etd.award = award;
etd.abstract = abstract;
//BIB_ETD fields
etd.subTitle = subtitle;
etd.volumes = Integer.parseInt(volumes);
etd.collation = numberedPages + " p.";
etd.notes = materials + "\n" + remarks;
etd.embargo = false;
etd.publicationPlace = institutionPlace;
yearFolder = getFolder(yearPath + year); //yearPath already ends with "/"
recordFacade.createRecord(yearFolder.id, etd);
etd.setPermission(Constants.EVERYONE, Constants.READ, true);
etd.setPermission(Constants.OWNER, Constants.READ, true);
etd.setPermission(Constants.OWNER, Constants.WRITE, false);
etd.setOwner(null);
accessControlFacade.updatePermissions(etd);
fulltext= etd.fullText;
fulltext.setPermission(Constants.EVERYONE, Constants.READ, true);
fulltext.setPermission(Constants.OWNER, Constants.READ, true);
fulltext.setPermission(Constants.GROUP, Constants.READ, true);
accessControlFacade.updatePermissions(fulltext);
appendices= etd.appendices;
appendices.setPermission(Constants.EVERYONE, Constants.READ, true);
appendices.setPermission(Constants.OWNER, Constants.READ, true);
appendices.setPermission(Constants.GROUP, Constants.READ, true);
accessControlFacade.updatePermissions(appendices);
author.work.add(etd.toLink());
recordFacade.updateRecord(author);
logger.debug("etd created: " + title);
etdCount++;
studentCount++;
columns = new String[0];
} catch (Exception e) {
e.printStackTrace();
throw new Exception("error in line " + line.substring(0, Math.min(line.length(), 20)) +
"... " + e);
}
}
}
(params.get(Constants.MESSAGES)).add("Created " + etdCount +
" theses for " + studentCount +
" students. " +
existingAuthors +
" authors already existed (see log for more information)");
Example 2
The second example imports ETDs and Authors, creates users and puts them into usergroups.
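The script for this example handles group work in two passes: while reading the file it creates an ETD only for the responsible student and remembers the other group members, and after the file has been read it attaches each remembered member to the ETD of his responsible. This is needed because a member can appear in the file before his responsible. The Java sketch below shows only that bookkeeping, without any DoKS calls; GroupResolver and membersPerResponsible are hypothetical names, and a blank responsible is represented by an empty string rather than the "s" prefix the script uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not part of DoKS.
class GroupResolver {

    // Each row is {studentNr, responsibleNr}; an empty responsibleNr means the student works alone.
    static Map<String, List<String>> membersPerResponsible(List<String[]> rows) {
        Map<String, List<String>> groups = new LinkedHashMap<>();
        Map<String, String> pendingMembers = new LinkedHashMap<>();

        // first pass: responsibles start a group, everyone else is remembered for later
        for (String[] row : rows) {
            String student = row[0];
            String responsible = row[1];
            if (responsible.isEmpty() || responsible.equals(student)) {
                List<String> members = new ArrayList<>();
                members.add(student);
                groups.put(student, members);
            } else {
                pendingMembers.put(student, responsible);
            }
        }

        // second pass: every responsible is known now, so the members can be attached
        for (Map.Entry<String, String> entry : pendingMembers.entrySet()) {
            List<String> members = groups.get(entry.getValue());
            if (members == null) {
                throw new IllegalStateException("incorrect responsible: " + entry.getValue()
                        + " for student: " + entry.getKey());
            }
            members.add(entry.getKey());
        }
        return groups;
    }
}
```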
#This is an example import file
#studentnummer naam voornaam email passwoord departement richting optie niveau studentnr groepsverantwoordelijke naam verantwoordelijke nummer administratieve groep code bib + departement jury eindwerknr volgorde
513785 Doe Jane S513785@school.com Departement Gezondheidszorg en Chemie Gegradueerde in Chemie Optie chemie Bachelor's 514000 PL-CH 2 6 2005001 1
514041 Diddy John S514041@school.com Departement Gezondheidszorg en Chemie Gegradueerde in Chemie Optie chemie Bachelor's 514000 PL-CH 2 6 2005002 1
514000 Joyce James S514000@school.com Departement Gezondheidszorg en Chemie Gegradueerde in Chemie Optie chemie Bachelor's 514000 PL-CH 2 6 2005002 1
//This script creates an ETD record and an Author record for each student.
//Each student will also have a user account and a workspace containing his ETD.
//If an author with the same name already exists, no new one will be created.
//The script expects the following:
// * a Document record with description "student information year", where year has to be replaced with the actual year
// this record must be located in the folder with path "/Home/Administration"
// and must contain a file named "students.txt"
// * students.txt must have one line for each student containing following values (tab separated)
// - student id (studentnummer)
// - name
// - first name
// - email
// - password
// - department
// - degree name (richting)
// - discipline (optie)
// - degree level (bachelor or master)
// - responsible student id (only for group work, blank otherwise)
//
//To execute the script, one must fill in the year in the field "Arg 0"
//To customize this script, look for "TODO"
args = params.get(Constants.ARGS);
// the user must enter the year in "Arg 0"
if ((args == null) || (args.length < 1) || (args[0].trim().length() == 0)) {
throw new Exception("you must specify the year");
}
encoding = doks.config.Config.getConfig().getProperty("importexport.encoding");
// the folder where we will look for the student information
administrationPath = "/Home/Administration";
// the folder with author records
authorPath = "/Home/Authors";
// the folder with all the theses of this year
yearPath = "/Home/Years/" + args[0];
// the name of the record with the student information (contains the year)
recordDescription = "student information " + args[0];
// the file to look for
fileName = "students.txt";
// look up the folders
authorFolderId = folderFacade.getFolderDTOByPath(authorPath).id;
adminFolder = folderFacade.getFolderDTOByPath(administrationPath);
fullTextGroup= userFacade.getUserGroupDTOByName("FullText");
try{
yearFolder = folderFacade.getFolderDTOByPath(yearPath);
} catch(e){
folderFacade.createFolder(yearPath, null, false);
yearFolder = folderFacade.getFolderDTOByPath(yearPath);
yearFolder.addStructureName("ETD");
folderFacade.updateFolder(yearFolder);
}
recordIter = folderFacade.getRecordLinks(adminFolder.id).iterator();
adminRecordDto = null;
while (recordIter.hasNext()) {
link = recordIter.next();
if (recordDescription.equals(link.getDescription())) {
adminRecordDto = recordFacade.getRecordDTO(link.id);
}
}
if (adminRecordDto == null) {
throw new Exception("could not find the record with description: " +
recordDescription);
}
// note: an etd must have a student that is responsible for it
// if there is only one student working on an etd, he is the responsible
// this map will contain etd.id for each group responsible (student id)
responsibleMap = new HashMap();
// this map will contain the student id of the group responsible for each student who is not the responsible himself
groupMembersMap = new HashMap();
// this map will contain the author.id for each student
authorMap = new HashMap();
etdCount = 0;
studentCount = 0;
existingAuthors = 0;
lineNumberReader = null;
// open the file
try {
studentFile = adminRecordDto.getDocuments()
.getFileByName(fileName).getFile();
reader = new InputStreamReader(new FileInputStream(studentFile), encoding);
lineNumberReader = new LineNumberReader(reader);
} catch (e) {
throw new Exception("could not open file with name: " + fileName);
}
// process each line of the file that doesn't start with #
// each line contains: student id, name, first name, email, password, department, degreeName, discipline, student id responsible
while ((line = lineNumberReader.readLine()) != null) {
if (!line.startsWith("#") && (line.trim().length() > 0)) {
logger.debug("parsing line: " + line.substring(0, Math.min(line.length(), 20)));
try {
columns = line.split("\t", -1);
//we add an S in front of the studentnr
studentNr = 's' + columns[0].trim();
name = columns[1].trim();
firstName = columns[2].trim();
email = columns[3].trim();
password = columns[4].trim();
department = columns[5].trim();
degreeName = columns[6].trim();
discipline = columns[7].trim();
level = columns[8].trim();
groupResponsible = 's' + columns[9].trim();
//TODO: add your own specific fields here
groupResponsibleName = columns[10].trim();
adminNumber = columns[11].trim();
code = columns[12].trim();
bibCopies = columns[13].trim();
juryCopies = columns[14].trim();
thesisNumber = columns[15].trim();
userDto = new UserDTO(studentNr, "StudentInfo");
userDto.setFullName(firstName + ' ' + name);
userDto.setEmail(email);
userDto.setLanguage("nl");
userFacade.createUser(userDto);
userFacade.setLoginConfiguration(userDto.id, "studenten_khk");
//add student to the userGroup Students
UserGroupDTO studentsGroup = userFacade.getUserGroupDTOByName("Students");
if (studentsGroup!=null){
studentsGroup.addUser(new UserLink(userDto));
userFacade.updateUserGroup(studentsGroup);
}else{
throw new Exception("Create a usergroup Students first!");
}
workplaceId = userFacade.createWorkplace(userDto.id);
workplace = folderFacade.getFolderDTO(workplaceId);
workplace.addStructureName("ETD");
folderFacade.updateFolder(workplace);
workplace.setPermission(Constants.EVERYONE, Constants.READ, false);
workplace.setPermission(Constants.OWNER, Constants.READ, true);
workplace.setPermission(Constants.OWNER, Constants.WRITE, false);
workplace.setPermission(Constants.OWNER, Constants.ADD, false);
accessControlFacade.updatePermissions(workplace);
//create the author record
fullName = name + ", " + firstName;
author = null;
authors = recordFacade.findRecords("Author",
"record.description=?",
new Object[] { fullName },
new net.sf.hibernate.type.Type[] {
net.sf.hibernate.Hibernate.STRING
});
if (authors.isEmpty()) { //the author does not exist yet
logger.debug("creating author");
author = recordFacade.createRecordDTO("Author");
author.setDescription(fullName);
letterFolder = folderFacade.getSubfolderDTOByName(authorFolderId,
name.substring(0,
1)
.toUpperCase());
recordFacade.createRecord(letterFolder.id, author);
authorMap.put(studentNr, author.id);
logger.debug("author created: " + fullName);
} else { //author already exists
authorId = authors.iterator().next().id;
authorMap.put(studentNr, authorId);
author = recordFacade.getRecordDTO(authorId);
existingAuthors++;
}
if (groupResponsible.equals("s") ||
groupResponsible.equals(studentNr)) {
logger.debug("creating etd");
//TODO: create your specific ETD and fill in the fields
etd = recordFacade.createRecordDTO("KHK_ETD");
etd.publisher= "Katholieke Hogeschool Kempen";
etd.grantor= "<a href='http://www.khk.be'>Katholieke Hogeschool Kempen</a>";
etd.type= "Electronic Thesis or Dissertation";
etd.language.add("dut");
etd.rights= "All rights reserved";
etd.status= "INIT";
etd.description= "vul hier uw titel in";
etd.publicationDate= Integer.parseInt(args[0]);
etd.authors.add(author.toLink());
etd.degreeName= degreeName;
etd.degreeLevel= level;
etd.discipline= discipline;
etd.volumes= 1;
etd.copies= "Bib + departement:\t" + bibCopies + "\n" +
"Jury:\t" + juryCopies + "\n" +
fullName + " (" + studentNr + "), inclusief stageplaats:\t1\n";
etd.thesisNumber= Integer.parseInt(thesisNumber);
etd.adminNumber= Integer.parseInt(adminNumber);
etd.department= department;
etd.code= code;
recordFacade.createRecord(workplaceId, etd);
etd.setPermission(Constants.EVERYONE, Constants.READ, false);
etd.setPermission(Constants.OWNER, Constants.READ, true);
etd.setPermission(Constants.OWNER, Constants.WRITE, false);
etd.setOwner(userDto.toLink());
accessControlFacade.updatePermissions(etd);
fulltext= etd.fullText;
fulltext.setPermission(Constants.EVERYONE, Constants.READ, false);
fulltext.setPermission(Constants.OWNER, Constants.READ, true);
fulltext.setPermission(Constants.GROUP, Constants.READ, true);
fulltext.setUserGroup(fullTextGroup.toLink());
accessControlFacade.updatePermissions(fulltext);
appendices= etd.appendices;
appendices.setPermission(Constants.EVERYONE, Constants.READ, false);
appendices.setPermission(Constants.OWNER, Constants.READ, true);
appendices.setPermission(Constants.GROUP, Constants.READ, true);
appendices.setUserGroup(fullTextGroup.toLink());
accessControlFacade.updatePermissions(appendices);
author.work.add(etd.toLink());
recordFacade.updateRecord(author);
responsibleMap.put(studentNr, etd.id);
folderFacade.addRecord(yearFolder.id, etd.id);
logger.debug("etd created: " + thesisNumber);
etdCount++;
} else {
logger.debug("adding student: " + studentNr +
" with responsible: " + groupResponsible);
groupMembersMap.put(studentNr, groupResponsible);
}
studentCount++;
} catch (Exception e) {
e.printStackTrace();
throw new Exception("error in line " + line.substring(0, Math.min(line.length(), 20)) +
"... " + e);
}
}
}
//for all students that are not the responsible:
//add them as author to the etd and add the etd to their workplace
groupIter = groupMembersMap.keySet().iterator();
while (groupIter.hasNext()) {
studentNr = groupIter.next();
authorId = authorMap.get(studentNr);
responsible = groupMembersMap.get(studentNr);
etdId = responsibleMap.get(responsible);
if (etdId == null) {
throw new Exception("incorrect responsible: " + responsible +
" for student: " + studentNr);
}
author = recordFacade.getRecordDTO(authorId);
etd = recordFacade.getRecordDTO(etdId);
etd.copies= etd.copies + author.getDescription() + " (" + studentNr +
"), inclusief stageplaats:\t1\n";
//remove the discipline for a group work (Sociaal Werk)
etd.discipline= null;
recordFacade.updateRecord(etd);
author.work.add(etd.toLink());
recordFacade.updateRecord(author);
workplaceId = folderFacade.getFolderDTOByPath("/Workplaces/" +
studentNr).id;
folderFacade.addRecord(workplaceId, etd.id);
//create a usergroup if it doesn't exist and grant every groupmember write permissions
group = userFacade.getUserGroupDTOByName("" + etd.thesisNumber);
if (group == null) {
group = new UserGroupDTO("" + etd.thesisNumber);
responsibleUser = userFacade.getUserDTOByName(responsible);
group.addUser(responsibleUser.toLink());
userFacade.createUserGroup(group);
}
user = userFacade.getUserDTOByName(studentNr);
group.addUser(user.toLink());
userFacade.updateUserGroup(group);
etd.setPermission(Constants.GROUP, Constants.READ, true);
etd.setPermission(Constants.GROUP, Constants.WRITE, true);
etd.setUserGroup(group.toLink());
accessControlFacade.updatePermissions(etd);
}
(params.get(Constants.MESSAGES)).add("Created " + etdCount +
" theses for " + studentCount +
" students. " +
existingAuthors +
" authors already existed (see log for more information)");
Example 3
This import script creates ETDs, Authors and users. Every thesis has more than one author (and a corresponding user). The users for the same thesis record are put in a user group so that they can be given access rights to the record.
#studentnr name firstname email password department thesisnr discipline publisher degree
00001 Test1 vnaam1 doks@test.be test 8 AU200611 Automatisering Company1 Bachelor in de Elektromechanica
00002 Test2 vnaam2 doks@test.be test 8 AU200611 Automatisering Company1 Bachelor in de Elektromechanica
00003 Test3 vnaam3 doks@test.be test 8 AU200614 Automatisering Company2 Bachelor in de Elektromechanica
00004 Test4 vnaam4 doks@test.be test 8 AU200614 Automatisering Company2 Bachelor in de Elektromechanica
encoding = doks.config.Config.getConfig().getProperty("importexport.encoding");
administrationPath = "/Home/Administration";
authorPath = "/Home/Authors";
yearPath = "/Home/Years/2006";
recordDescription = "Studenten";
fileName = "studenten.txt";
authorFolderId = folderFacade.getFolderDTOByPath(authorPath).id;
adminFolder = folderFacade.getFolderDTOByPath(administrationPath);
yearFolder = folderFacade.getFolderDTOByPath(yearPath);
recordIter = folderFacade.getRecordLinks(adminFolder.id).iterator();
adminRecordDto = null;
while (recordIter.hasNext()) {
link = recordIter.next();
if (recordDescription.equals(link.getDescription())) {
adminRecordDto = recordFacade.getRecordDTO(link.id);
}
}
if (adminRecordDto == null) {
throw new Exception("could not find the record with description: " + recordDescription);
}
etdCount = 0;
studentCount = 0;
existingAuthors = 0;
failedCount = 0;
lineNumberReader = null;
try {
studentFile = adminRecordDto.getDocuments().getFileByName(fileName).getFile();
reader = new InputStreamReader(new FileInputStream(studentFile), encoding);
lineNumberReader = new LineNumberReader(reader);
} catch (e) {
throw new Exception("could not open file with name: " + fileName);
}
while ((line = lineNumberReader.readLine()) != null) {
if (!line.startsWith("#") && (line.trim().length() > 0)) {
logger.debug("parsing line: " + line.substring(0, ((line.length()<20)?line.length():20)));
try {
columns = line.split("\t", -1);
studentNr = columns[0].trim();
name = columns[1].trim();
firstName = columns[2].trim();
email = columns[3].trim();
password = columns[4].trim();
department = columns[5].trim();
thesisNumber = columns[6].trim();
discipline = columns[7].trim();
uitgever = columns[8].trim();
naamvandegraad = columns[9].trim();
userDto = new UserDTO(studentNr, "StudentInfo");
userDto.setFullName(firstName + ' ' + name);
userDto.setEmail(email);
// userDto.setPassword(password);
userDto.setLanguage("nl");
userFacade.createUser(userDto);
userFacade.setLoginConfiguration(userDto.id, "studenten");
//create the author record
fullName = name + ", " + firstName;
authorUID = "" + studentNr;
author = null;
authors = recordFacade.findRecords("Author", "record.uuid=?", new Object[] {authorUID}, new net.sf.hibernate.type.Type[] {net.sf.hibernate.Hibernate.STRING});
if (authors.isEmpty()) { //the author does not exist yet
logger.debug("creating author");
author = recordFacade.createRecordDTO("Author");
author.setDescription(fullName);
author.setUuid(authorUID);
letterFolder = folderFacade.getSubfolderDTOByName(authorFolderId, name.substring(0, 1).toUpperCase());
recordFacade.createRecord(letterFolder.id, author);
logger.debug("author created: " + fullName);
}
else
{ //author already exists
logger.warn("author exists!: " + fullName + " " + authorUID);
authorId = authors.iterator().next().id;
author = recordFacade.getRecordDTO(authorId);
existingAuthors++;
}
groupDto = userFacade.getUserGroupDTOByName("" + thesisNumber);
userGroupExists = true;
if (groupDto == null) {
userGroupExists = false;
groupDto = new UserGroupDTO("" + thesisNumber);
userFacade.createUserGroup(groupDto);
}
groupDto.addUser(userDto.toLink());
userFacade.updateUserGroup(groupDto);
if (!userGroupExists) {
etd = recordFacade.createRecordDTO("TEST_ETD");
etd.publisher= uitgever;
etd.grantor= "<a href='http://www.test.be'>Test Hogeschool</a>";
etd.type= "Thesis/Eindwerk";
etd.language.add("dut");
etd.rights= "All rights reserved";
etd.status= "INIT";
etd.description= "vul hier uw titel in";
etd.publicationDate= Integer.parseInt("2006");
etd.authors.add(author.toLink());
etd.department= department;
etd.thesisNumber= thesisNumber;
etd.discipline = discipline;
etd.degreeName = naamvandegraad;
recordFacade.createRecord(yearFolder.id, etd);
etd.setPermission(Constants.EVERYONE, Constants.READ, true);
etd.setPermission(Constants.GROUP, Constants.READ, true);
etd.setPermission(Constants.GROUP, Constants.WRITE, true);
etd.setPermission(Constants.OWNER, Constants.READ, true);
etd.setPermission(Constants.OWNER, Constants.WRITE, true);
etd.setUserGroup(groupDto.toLink());
accessControlFacade.updatePermissions(etd);
fulltext= etd.fullText;
fulltext.setPermission(Constants.EVERYONE, Constants.READ, true);
fulltext.setPermission(Constants.GROUP, Constants.READ, true);
fulltext.setPermission(Constants.GROUP, Constants.WRITE, true);
fulltext.setPermission(Constants.OWNER, Constants.READ, true);
fulltext.setPermission(Constants.OWNER, Constants.WRITE, true);
fulltext.setUserGroup(groupDto.toLink());
accessControlFacade.updatePermissions(fulltext);
appendices= etd.appendices;
appendices.setPermission(Constants.EVERYONE, Constants.READ, true);
appendices.setPermission(Constants.GROUP, Constants.READ, true);
appendices.setPermission(Constants.GROUP, Constants.WRITE, true);
appendices.setPermission(Constants.OWNER, Constants.READ, true);
appendices.setPermission(Constants.OWNER, Constants.WRITE, true);
appendices.setUserGroup(groupDto.toLink());
accessControlFacade.updatePermissions(appendices);
author.work.add(etd.toLink());
recordFacade.updateRecord(author);
logger.debug("etd created: " + thesisNumber);
etdCount++;
}
else
{
iter = recordFacade.findRecords("TEST_ETD", "record.thesisNumber=?", new Object[] {thesisNumber}, new net.sf.hibernate.type.Type[] {net.sf.hibernate.Hibernate.STRING}).iterator();
if (iter.hasNext()){
etd = recordFacade.getRecordDTO(iter.next().id);
etd.authors.add(author.toLink());
recordFacade.updateRecord(etd);
author.work.add(etd.toLink());
recordFacade.updateRecord(author);
}else{
logger.error("TEST_ETD with thesisNumber=" + thesisNumber + " expected but not found.");
failedCount++;
}
}
studentCount++;
} catch (Exception e) {
e.printStackTrace();
throw new Exception("error in line " + line+ "... " + e);
}
}
}
lineNumberReader.close(); //release the import file once all lines are processed
(params.get(Constants.MESSAGES)).add("Created " + etdCount +
" theses for " + studentCount +
" students. " +
existingAuthors +
" authors already existed. <br>" + failedCount + " times failed to add additional author to ETD (see log for more information)");
Exporting Data
Scripts can export metadata in any text based format: a tab separated file for import into database or spreadsheet software, XML, plain text, and so on.
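For tab separated output, every value has to stay on a single line and must not contain a tab itself. The following is a minimal sketch in plain Java of a cleaning helper that applies the same replacements Example 3 on this page uses for abstracts. The class and method names (TsvFieldSketch, clean, toLine) are made up for this illustration and are not part of the DoKS API.

```java
class TsvFieldSketch {
    // Returns a value that can safely be written as one tab separated column.
    static String clean(String value) {
        if (value == null) {
            return ""; // an empty column keeps the following columns aligned
        }
        // tabs become spaces, line breaks become <br> (as in Example 3)
        return value.replaceAll("\t", " ").replaceAll("\r\n|\r|\n", "<br>");
    }

    // Joins the cleaned values into one line that ends with a newline.
    static String toLine(String... values) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                line.append('\t');
            }
            line.append(clean(values[i]));
        }
        return line.append('\n').toString();
    }

    public static void main(String[] args) {
        System.out.print(toLine("Departement X", null, "Title\twith tab", "line 1\r\nline 2"));
    }
}
```

In a DoKS script the same idea could be written as an untyped helper method, so that each osw.write call passes its value through the helper first.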
Creating and writing to a file
A file inside DoKS:
//create a temp file and an outputstream
encoding = doks.config.Config.getConfig().getProperty("importexport.encoding"); //for default encoding as set in doks.properties, otherwise set it fixed to "UTF-8" or ...
tempFile = File.createTempFile("myTempFile", ".txt");
osw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(tempFile), encoding));
//write to the outputstream
while(...){
...
osw.write("This is my text");
...
}
//close the outputstream
osw.flush();
osw.close();
//add the file to a record
fileName = "myFinalFile.txt";
recordDto.documents.removeFileByName(fileName);
recordDto.documents.addFile(tempFile, fileName, null);
recordFacade.updateRecord(recordDto);
tempFile.delete();
A file on the local file system:
exportFile= new java.io.File("export.txt");
exportFile.createNewFile();
osw = new java.io.OutputStreamWriter(new java.io.FileOutputStream(exportFile));
//write to the outputstream
while(...){
...
osw.write("This is my text");
...
}
//close the outputstream
osw.flush();
osw.close();
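If you want to try the writing pattern outside DoKS, the sketch below does the same in plain Java. It assumes UTF-8 as the encoding (in a DoKS script you would read it from doks.properties as shown above) and uses try-with-resources, so the writer is flushed and closed even when a write fails. ExportFileSketch and writeRows are names invented for this example.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class ExportFileSketch {
    // Writes every row as one tab separated line and returns the number of lines written.
    static int writeRows(File target, List<String[]> rows) throws IOException {
        int count = 0;
        // the writer is closed automatically at the end of the try block
        try (BufferedWriter out = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(target), StandardCharsets.UTF_8))) {
            for (String[] row : rows) {
                out.write(String.join("\t", row));
                out.write("\n");
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        File tempFile = File.createTempFile("myTempFile", ".txt");
        int written = writeRows(tempFile, Arrays.asList(
                new String[] {"Thesisnummer", "Titel"},
                new String[] {"2006001", "Een voorbeeld"}));
        System.out.println(written + " lines written to " + tempFile);
        tempFile.delete();
    }
}
```

Passing the encoding explicitly avoids depending on the platform default, which is what the local file system example above falls back to.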
For a complete script (see the examples below) you would:
- set a path and a record description that point to an existing record of type Document, to which the exported file will be attached. This works the same way as in the import scripts.
- create a query for the list of records whose metadata you want to write to the file. See The_DoKS_Scripting_API#Queries
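The example scripts on this page build their query filter by appending a script argument (the year) directly to the filter string. A small check on the argument before it goes into the filter is a cheap safeguard. The sketch below is plain Java and only builds the string; FilterSketch and publicationYearFilter are hypothetical names, and the four digit rule is an assumption about what a valid year looks like.

```java
class FilterSketch {
    // Builds the filter used in the export examples, but only for a four digit year.
    static String publicationYearFilter(String yearArg) {
        String year = yearArg.trim();
        if (!year.matches("\\d{4}")) {
            throw new IllegalArgumentException("not a four digit year: " + yearArg);
        }
        return "record.publicationDate=" + year;
    }

    public static void main(String[] args) {
        System.out.println(publicationYearFilter(" 2006 "));
    }
}
```

The resulting string can then be passed to recordFacade.findRecords in the same way the examples do.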
Complete example scripts
Example 1: Export metadata and files to a directory
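This script exports thesis number, title, language, degree, department, option and authors to a tab separated text file. It also copies the abstract and the full text files of each thesis to a directory of its own.
It expects the year, the department code and the target directory as arguments (Arg 0, Arg 1 and Arg 2). As listed here, only the target directory is actually used: the filter selects every record with volumes > -1.
It contains examples of creating files and directories on the file system.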
args=params.get(Constants.ARGS);
if (args==null || args.length < 3 || args[0].trim().length()==0 || args[1].trim().length()==0)
throw new Exception("you must specify the year, the (first letters of the) department code and the export directory");
exportFile= new java.io.File(args[2].trim()+'/'+"export.txt");
exportFile.createNewFile();
osw = new java.io.OutputStreamWriter(new java.io.FileOutputStream(exportFile));
osw.write("Thesisnummer\tTitel\tTaal\tDiploma\tDepartement\tRichting\tAutheurs\n");
filter= "volumes > -1";
theses= recordFacade.findRecords("KHK_ETD", filter);
iter= theses.iterator();
count= 0;
while (iter.hasNext()){
etdId = iter.next().id;
etd = recordFacade.getRecordDTO(etdId);
osw.write(etd.thesisNumber + "\t");
osw.write(etd.description + "\t");
langIter= etd.language.iterator();
while (langIter.hasNext()){
osw.write(langIter.next());
osw.write(";");
}
osw.write("\t");
osw.write(etd.degreeName + "\t");
osw.write(etd.department + "\t");
osw.write(etd.discipline + "\t");
authorIter= etd.authors.iterator();
while (authorIter.hasNext()){
osw.write(authorIter.next().description);
osw.write("\t");
}
osw.write("\n");
etdDir= new java.io.File(args[2].trim()+'/'+etd.id);
etdDir.mkdir();
abstractFile= new java.io.File(etdDir, "samenvatting.txt");
abstractFile.createNewFile();
abstractWriter = new java.io.OutputStreamWriter(new java.io.FileOutputStream(abstractFile));
if (etd.abstract!=null) abstractWriter.write(etd.abstract); //a thesis without abstract gets an empty file
abstractWriter.close();
fileIter= etd.full_text.getFileIterator();
while (fileIter.hasNext()){
srcFile= fileIter.next().getFile();
dstFile= new java.io.File(etdDir, srcFile.name);
doks.util.FileCopy.bufferedCopy(srcFile, dstFile);
}
count++;
}
osw.close();
Example 2: MARC Export
This script creates a text file with MARC records for each ETD.
/*
Before running the script create a Document "MARC export xxxx" (xxxx=the year) in the Administration folder!
*/
import doks.util.StringUtils;
getTitleIndicator2(description) {
words = description.split("\\s");
if (stopWords.contains(words[0].toLowerCase())) {
return (char) (words[0].length() + 49); //49 is ascii '1' (add 1 for the space after the word)
}
return '0';
}
getAuthorIndicator1(author) {
lastName = author.substring(0, author.indexOf(','));
if (lastName.indexOf(' ') > 0) {
//this is a composed last name like "De Pooter"
return '2';
}
return '1';
}
try {
args = params.get(Constants.ARGS);
// the user must enter the year in "Arg 0"
if ((args == null) || (args.length < 1) || (args[0].trim().length() == 0)) {
throw new Exception("you must specify the year");
}
year = args[0].trim();
// the folder that contains the record for the export file
administrationPath = "/Home/Administration";
// the description of the record to which the export file will be attached (contains the year)
recordDescription = "MARC export " + year;
// the name of the exported file
fileName = "eindwerken.mrk8";
encoding = doks.config.Config.getConfig().getProperty("importexport.encoding");
stopWords = new HashSet();
stopWords.add("de");
stopWords.add("het");
stopWords.add("een");
stopWords.add("un");
stopWords.add("une");
stopWords.add("le");
stopWords.add("les");
stopWords.add("la");
stopWords.add("the");
stopWords.add("a");
stopWords.add("an");
// look up the folders
adminFolder = folderFacade.getFolderDTOByPath(administrationPath);
recordIter = folderFacade.getRecordLinks(adminFolder.id)
.iterator();
adminRecordDto = null;
while (recordIter.hasNext()) {
link = recordIter.next();
if (recordDescription.equals(link.description)) {
adminRecordDto = recordFacade.getRecordDTO(link.id);
}
}
if (adminRecordDto == null) {
throw new Exception("could not find the record with description: " +
recordDescription);
}
tempFile = File.createTempFile("marcexport", ".mrk8");
osw = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(tempFile), encoding));
filter = "(record.status='FULLTEXT_AVAILABLE' OR record.status='PUBLISHED') AND record.publicationDate=" +
year;
iter = recordFacade.findRecords(Constants.LOCALSTRUCTURE, filter).iterator();
count = 0;
while (iter.hasNext()) {
etd = recordFacade.getRecordDTO(iter.next().id);
if ("KMO".equals(etd.code))continue;
osw.write("=LDR 00718nam a2200145 4500\n");
osw.write("=008 " + year.substring(2) + "0901s" + year +
"\\\\\\\\bel\\\\\\\\\\\\\\\\\\\\\\000\\0\\dut\\d\n");
//authors
Iterator authorIter = etd.authors.iterator();
if (!authorIter.hasNext()) {
throw new Exception("etd: " + etd.thesisNumber +
" has no authors");
}
String author = ((RecordLink) authorIter.next()).description;
osw.write("=100 " + getAuthorIndicator1(author) + "\\$a" + author +
"$4aut.\n");
while (authorIter.hasNext()) {
author = ((RecordLink) authorIter.next()).description;
osw.write("=700 " + getAuthorIndicator1(author) + "\\$a" + author +
"$4aut.\n");
}
//title
String title = etd.description;
if (!StringUtils.isEmpty(etd.subTitle)) {
title += (" : " + etd.subTitle);
}
osw.write("=245 1" + getTitleIndicator2(title) + "$a" +
title + "\n");
osw.write("=260 \\\\$aGeel$bKatholieke Hogeschool Kempen. Campus HIK$c" +
year + "\n");
osw.write("=300 \\\\$a" + etd.collation + "\n");
osw.write("=502 \\\\$aEindwerk KHK, Campus HIK Geel. " +
etd.department + ". " + etd.degreeName +
((etd.discipline==null || etd.discipline.trim().length()==0)?" , ":(". " + etd.discipline + " , ")) + "academiejaar " +
(Integer.parseInt(year) - 1) + "-" + year + "\n");
osw.write("=520 0\\$a" + ((etd.abstract!=null)?etd.abstract.replaceAll("\\s+", " "):"") + "\n");
osw.write("=856 4\\$uhttp://doks.khk.be/eindwerk/do/record/Get?dispatch=view&recordId=" +
etd.id + "\n");
lastPartOfthesisNumber = String.valueOf(etd.thesisNumber)
.substring(4);
if ("Departement Sociaal Werk".equals(etd.department))
code= "PM-MA";
else
code= etd.code;
osw.write("=945 \\\\$b" + code + "-" + etd.thesisNumber +
"$dAP$mAMONO$lHIK$aA" + year + "80" + lastPartOfthesisNumber +
"$oHIK$n" + year + "80" + lastPartOfthesisNumber + "\n");
osw.write("=945 \\\\$bL" + code + "-" + etd.thesisNumber +
"$dAP$mAREF$lHIK$aA" + year + "90" + lastPartOfthesisNumber +
"$oHIK$n" + year + "90" + lastPartOfthesisNumber + "\n");
osw.write('\n');
}
osw.flush();
osw.close();
//add the file to the admin record
adminRecordDto.documents.removeFileByName(fileName);
adminRecordDto.documents.addFile(tempFile, fileName, null);
recordFacade.updateRecord(adminRecordDto);
tempFile.delete();
//show the admin record to the user
params.put(Constants.RECORD_DTO, adminRecordDto);
params.put(Constants.FORWARD, Constants.RECORD_VIEW);
} catch (e) {
throw e;
}
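The two indicator helpers at the top of the script above can be tried in isolation. The sketch below restates them in plain Java: the title indicator is the length of a leading stop word plus one for the space that follows it, written as a digit character, and the author indicator is '2' when the last name contains a space. MarcIndicatorSketch is a name made up for this example. Unlike the script, this version also accepts an author without a comma, which is an addition for robustness and not something the original does.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class MarcIndicatorSketch {
    // the same stop words as in the script above
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "de", "het", "een", "un", "une", "le", "les", "la", "the", "a", "an"));

    // Number of leading characters to skip (stop word plus one space) as a digit, or '0'.
    static char titleIndicator2(String title) {
        String firstWord = title.split("\\s")[0];
        if (STOP_WORDS.contains(firstWord.toLowerCase())) {
            // same value as (char) (length + 49) in the script; only valid for short words
            return (char) ('0' + firstWord.length() + 1);
        }
        return '0';
    }

    // '2' for a composed last name like "De Pooter", otherwise '1'.
    static char authorIndicator1(String author) {
        int comma = author.indexOf(',');
        // assumption: without a comma the whole string is treated as the last name
        String lastName = (comma >= 0) ? author.substring(0, comma) : author;
        return (lastName.indexOf(' ') > 0) ? '2' : '1';
    }

    public static void main(String[] args) {
        System.out.println(titleIndicator2("De kleine prins"));
        System.out.println(authorIndicator1("De Pooter, Jan"));
    }
}
```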
Example 3: Basic abstract Export
This script creates a text file with the information for a basic abstract book. It exports department, degree name, title, abstract and authors, and uses the year (Arg 0) as filter. You can use it as a starting point for other export scripts: add more fields from your local structure, then edit the template file so that the additional data is merged as well.
To use it:
- Create a new Document in the Administration folder named "abstracts xxxx" (xxxx = the year), to which the exported file can be attached. If the record is missing or named differently, you get the error "could not find the record with description ...".
- Run the script, open the exported text file in Excel and save it as an xls document.
- Use the file basicAbstractListTemplate.doc, which can be found at http://doks.khk.be/do/record/Get?dispatch=view&recordId=SDoc413ebf1713bfd5d10115d153fbe3000b, as the template to merge with the Excel file.
- Open the mail merge toolbar (View - Toolbars - Mail Merge), use its buttons to connect to the Excel file and click the "merge in a new document" button.
args=params.get(Constants.ARGS);
if (args==null || args.length < 1 || args[0].trim().length()==0)
throw new Exception("you must specify the year");
administrationPath = "/Home/Administration";
recordDescription = "abstracts " + args[0].trim();
fileName = "abstracts_" + args[0].trim();
dateFormat = new java.text.SimpleDateFormat("d-M-yyyy");
encoding = doks.config.Config.getConfig().getProperty("importexport.encoding");
//get the student names from the authors in the form "firstname lastname", sorted by lastname
getNames(etd){
names="";
count=0;
sortedAuthors= new TreeSet();
iter= etd.authors.iterator();
while (iter.hasNext()){
sortedAuthors.add(iter.next().description);
}
iter= sortedAuthors.iterator();
while (iter.hasNext()){
author= iter.next();
//authors are stored in the form "lastname, firstname" -> conversion needed
commabPos= author.indexOf(",");
lastName= author.substring(0, commabPos);
lastName= lastName.replaceAll(" ", " ");
firstName= author.substring(commabPos + 2);
names+= firstName + " " + lastName + "\t";
count++;
}
while(count<10){
names+="\t";
count++;
}
return names;
}
//search the record to which we will add the exported file
adminFolder = folderFacade.getFolderDTOByPath(administrationPath);
recordIter = folderFacade.getRecordLinks(adminFolder.id).iterator();
adminRecordDto = null;
while (recordIter.hasNext()) {
link = recordIter.next();
if (recordDescription.equals(link.description)) {
adminRecordDto = recordFacade.getRecordDTO(link.id);
}
}
if (adminRecordDto==null)
throw new Exception("could not find the record with description: " + recordDescription);
//if abstracts_XXXX_1.txt exists then create abstracts_XXXX_2.txt and so on
fileExists= true;
fileCounter= 1;
while (fileExists){
newFileName= fileName+ "_" + fileCounter + ".txt";
if (adminRecordDto.documents.getFileByName(newFileName)==null)
fileExists= false;
else
fileCounter++;
}
//create a temp file where we can write a line per thesis
tempFile= File.createTempFile("titels", ".txt");
osw = new OutputStreamWriter(new FileOutputStream(tempFile), encoding);
osw.write("Departement\tDiploma\tTitel\tSamenvatting\t");
osw.write("auteur\tNaam2\tNaam3\tNaam4\tNaam5\tNaam6\tNaam7\tNaam8\tNaam9\tNaam10\n");
filter= "record.publicationDate="+ args[0].trim();
theses= recordFacade.findRecords(Constants.LOCALSTRUCTURE, filter);
iter= theses.iterator();
count= 0;
while (iter.hasNext()){
etdId = iter.next().id;
etd = recordFacade.getRecordDTO(etdId);
osw.write(etd.department + "\t");
osw.write(etd.degreeName + "\t");
osw.write(etd.description + "\t");
if (etd.abstract!=null){
osw.write(etd.abstract.replaceAll("\t", " ").replaceAll("\r\n|\r|\n", "<br>") + "\t");
} else {
osw.write("\t"); //keep the columns aligned when there is no abstract
}
osw.write(getNames(etd));
osw.write("\n");
count++;
}
osw.close();
//add the file to the admin record
adminRecordDto.documents.addFile(tempFile, newFileName, storageFacade.getFileFormatDTOByName("Text").toLink());
recordFacade.updateRecord(adminRecordDto);
tempFile.delete();
//show the admin record to the user
params.put(Constants.RECORD_DTO, adminRecordDto);
params.put(Constants.FORWARD, Constants.RECORD_VIEW);
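The name handling in getNames above can be tested on its own. The following plain Java sketch converts the stored form "lastname, firstname" to "firstname lastname", sorts by the stored form (so by last name) and pads the result with empty columns, as the script does for its ten name columns. AuthorNamesSketch, toDisplayName and toColumns are names invented for this example. Trimming both parts and tolerating a missing comma are small additions that the script itself does not have.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.TreeSet;

class AuthorNamesSketch {
    // Converts "lastname, firstname" to "firstname lastname".
    static String toDisplayName(String stored) {
        int comma = stored.indexOf(',');
        if (comma < 0) {
            return stored.trim(); // assumption: no comma means there is nothing to swap
        }
        String lastName = stored.substring(0, comma).trim();
        String firstName = stored.substring(comma + 1).trim();
        return firstName + " " + lastName;
    }

    // One tab terminated column per author, padded with empty columns up to columnCount.
    static String toColumns(Collection<String> storedNames, int columnCount) {
        StringBuilder columns = new StringBuilder();
        int count = 0;
        // a TreeSet sorts the stored names, which start with the last name
        for (String stored : new TreeSet<>(storedNames)) {
            columns.append(toDisplayName(stored)).append('\t');
            count++;
        }
        while (count < columnCount) {
            columns.append('\t');
            count++;
        }
        return columns.toString();
    }

    public static void main(String[] args) {
        System.out.println(toColumns(Arrays.asList("Peeters, An", "De Pooter, Jan"), 10));
    }
}
```

The padding matters for the mail merge: every line of the exported file then has the same number of columns as the header line.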
Running scripts from the command line
You can use the doks.data.ScriptRunner class to run scripts from the command line (if your CLASSPATH is set correctly and the script doesn't need any variables from a browser session).
java doks.data.ScriptRunner <scriptName> [<arg0>] [<arg1>] ...
Example:
java doks.data.ScriptRunner listFileFormats 2006
This way, scripts can be scheduled or automated with a cron job or the Windows Task Scheduler.
