Commit 227538f9 authored by Ubbo Veentjer

add markdown renderer and index/help markdown text

parent d154dbb5
......@@ -25,6 +25,7 @@ dependencies {
implementation 'org.apache.cxf:cxf-rt-rs-client:3.3.4'
implementation 'io.jsonwebtoken:jjwt:0.9.1'
implementation 'com.fasterxml.jackson.jaxrs:jackson-jaxrs-json-provider'
implementation 'com.atlassian.commonmark:commonmark:0.14.0'
implementation 'info.textgrid.middleware.clients:textgrid-clients:3.4.1-SNAPSHOT'
testImplementation('org.springframework.boot:spring-boot-starter-test') {
exclude group: 'org.junit.vintage', module: 'junit-vintage-engine'
......
package info.textgrid.rep.markdown;

import java.util.HashMap;
import java.util.Locale;
import org.springframework.stereotype.Controller;
import org.springframework.ui.Model;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PathVariable;
import info.textgrid.rep.shared.I18NUtils;

@Controller
public class DocController {

  // the start page is the "index" document in the language of the request locale
  @GetMapping("/")
  public String renderIndex(
      Model model,
      Locale locale
      ) {
    return renderMarkdown(model, "index", locale);
  }

  // renders the classpath resource docs/<doc>.<language>.md with the "markdown" view
  @GetMapping("/doc/{doc}")
  public String renderMarkdown(
      Model model,
      @PathVariable("doc") String doc,
      Locale locale) {
    String dloc = "docs/" + doc + "." + locale.getLanguage() + ".md";
    String content = MarkdownRenderer.renderDoc(dloc);
    model.addAttribute("content", content);
    // translation array
    HashMap<String, String> i18n = I18NUtils.i18n(locale);
    model.addAttribute("i18n", i18n);
    model.addAttribute("language", locale.getLanguage());
    return "markdown";
  }
}
package info.textgrid.rep.markdown;

import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import org.commonmark.node.Node;
import org.commonmark.parser.Parser;
import org.commonmark.renderer.html.HtmlRenderer;

public class MarkdownRenderer {

  public static String renderDoc(String path) {
    URL templateUrl = MarkdownRenderer.class.getClassLoader().getResource(path);
    if (templateUrl == null) {
      // getResource returns null when the document is not on the classpath
      throw new IllegalArgumentException("markdown document not found: " + path);
    }
    try {
      // note: resolving the URL as a Path may fail for resources inside a packaged jar
      return renderMarkdown(Paths.get(templateUrl.toURI()));
    } catch (URISyntaxException e) {
      throw new IllegalArgumentException("invalid resource location: " + templateUrl, e);
    }
  }

  public static String renderMarkdown(Path path) {
    String markdown;
    try {
      // decode explicitly as UTF-8 instead of the platform default charset
      markdown = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
    } catch (IOException e) {
      // fail loudly instead of parsing a null string later on
      throw new UncheckedIOException(e);
    }
    Parser parser = Parser.builder().build();
    Node document = parser.parse(markdown);
    HtmlRenderer renderer = HtmlRenderer.builder().build();
    return renderer.render(document);
  }
}
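`renderDoc` turns the classpath URL into a `Path`, which may fail with a `FileSystemNotFoundException` once the documents are packaged inside a jar. A minimal sketch of a stream-based alternative, assuming Java 8; the class `ClasspathText` and its methods are hypothetical and not part of this commit:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the commit: reads classpath resources as UTF-8 text.
final class ClasspathText {

  // Reads the stream to its end and decodes the bytes as UTF-8.
  static String readUtf8(InputStream in) {
    try {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      int n;
      while ((n = in.read(buf)) != -1) {
        out.write(buf, 0, n);
      }
      return new String(out.toByteArray(), StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Opens the resource as a stream, which also works for entries inside a jar.
  static String readResource(String path) {
    InputStream in = ClasspathText.class.getClassLoader().getResourceAsStream(path);
    if (in == null) {
      throw new IllegalArgumentException("resource not found: " + path);
    }
    try (InputStream stream = in) {
      return readUtf8(stream);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The returned string could then be passed directly to `parser.parse(...)`, so no `Path` is needed.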
The TextGrid Repository, a long-term archive for humanities research data, provides an extensive,
searchable and reusable collection of XML/TEI-encoded texts, images and databases. With the
[Digital Library of TextGrid](https://textgrid.de/digitale-bibliothek), the steadily growing collection today includes, for example, works by around 600 authors of German-language
fiction (prose, poetry, drama) and nonfiction from the beginnings of letterpress printing to the early 20th century,
which can be stored, published and searched in various output formats (e.g. XML, ePub, PDF).
Various tools, for instance for image viewing or quantitative text analysis, can be used to explore and visualise
the texts further.
The TextGrid Repository is part of the [TextGrid](https://textgrid.de/) virtual research environment, which, in addition to the subject-specific
long-term archive, offers open-source software for the collaborative creation and publication of, for example, digital editions
based on XML/TEI.
You can search the collection directly by entering a search term; alternatively, "Explore" displays it
by predefined categories (e.g. "author", "genre").
The search supports the [Lucene](https://lucene.apache.org/core/) query language; in addition to free-text search it allows, among other things, queries of
the following form:
[edition.agent.value:goethe AND pudel](/search?query=edition.agent.value%3Agoethe+AND+pudel)
Further information on the search syntax, search categories and filters can be found in the [Help](/doc/syntax).
**Participate**
Would you like to archive your own XML-encoded material in the TextGrid Repository in citable form and make it accessible?
Contact us: https://textgrid.de/kontakt/
# Mission Statement
The TextGrid Repository (TextGridRep) is a digital preservation archive for humanities research data that provides a variety of data for teaching and research purposes. It promotes open access to research data and open standards that allow efficient reuse in research. The TextGridRep also provides researchers with a comprehensive and reliable service to store their data permanently, well described and with a stable reference for citation and reuse.
The TextGridRep is part of the [TextGrid](https://textgrid.de/en/) Virtual Research Environment (VRE), which, besides digital preservation, also offers open-source software for the collaborative creation, analysis and publication of texts and images. The TextGrid VRE is optimised for XML/TEI formats and editorial publication from the TextGrid Laboratory (TextGridLab). Publishing independently of the TextGridLab, including other types of data and formats, is equally possible with tools that use the TextGridRep API, such as TG-import.
The TextGrid Repository is a community-oriented result of a national programme to establish a Digital Humanities infrastructure in Germany and operates together with the DARIAH-DE Repository as part of the [Humanities Data Center](https://humanities-data-centre.de/) (HDC).
The mission of the TextGridRep is to serve national and international research, teaching and learning by providing long-term preservation, further processing, open sharing and dissemination of digital research data according to the ethical and scientific standards of the international research community.
The repository's mission is in line with the [Open Access strategy of the University of Göttingen](https://www.uni-goettingen.de/en/221506.html) and its [research data policy](http://www.uni-goettingen.de/en/488918.html). It provides all necessary resources to promote and support making the research results of its researchers as widely accessible and usable as possible. This commitment to open access is reflected in the organisational and technical infrastructure as well as in the archiving procedures of the repository, which allow the use of publications and data without any access restriction in order “to support research and innovation in science […] and society in a direct and lasting way”.
In terms of [data management](https://wiki.de.dariah.eu/display/TextGrid/Digital+Object+Management), publication and preservation workflows are based on the Open Archival Information System (OAIS), see [TextGrid Repository – Digital Object Management](https://wiki.de.dariah.eu/display/TextGrid/Digital+Object+Management#DigitalObjectManagement-TextGridandtheOpenArchivalInformationSystem(OAIS)).
This commitment is strongly supported by the two responsible institutions, which also ensure the long-term sustainability of the repository and its data: the [Göttingen State and University Library](https://www.sub.uni-goettingen.de/en/about-us/portrait/) (SUB) and the [Gesellschaft für wissenschaftliche Datenverarbeitung Göttingen mbH](https://www.gwdg.de/about-us) (GWDG).
Both institutions share a commitment to the sustainability of services and to [FAIR principles](https://www.go-fair.org/fair-principles/) in research and its infrastructures. For the SUB, research data management is an important aspect of the [strategic aims of Göttingen State and University Library](https://www.sub.uni-goettingen.de/en/about-us/portrait/strategy/#c13124). Not only for research data but for all digital resources, Göttingen State and University Library follows a [policy](https://www.sub.uni-goettingen.de/en/about-us/portrait/goettingen-state-and-university-library-digital-policies-guiding-principles/) whose guiding principles ensure the quality of access, metadata and IT architecture.
In the context of open access, the Göttingen State and University Library also participates in national and international projects, such as the [Confederation of Open Access Repositories](https://www.coar-repositories.org/) (COAR) and [OpenAIRE](http://www.openaire.eu/). In this respect the TextGrid Repository is also in line with the open access requirements of important funders of the German research system, such as the German Research Foundation (DFG) (see <https://www.dfg.de/formulare/2_00/v/dfg_2_00_de_v1215.pdf>, p. 44, section 12.2.1) and the European Union. Mandates of the European Commission and the European Research Council, as stated e.g. in the European Open Access Pilot on Open Data, require all funded projects to publish their results in Open Access (see the [Horizon 2020 Online Manual](https://ec.europa.eu/research/participants/docs/h2020-funding-guide/cross-cutting-issues/open-access-data-management/open-access_en.htm)). The Research Department at Göttingen University also offers detailed information about the [European Union Open Access Pilot](https://www.uni-goettingen.de/en/487290.html) on its web pages.
# Corpus and Digital Library of TextGrid
The TextGrid Repository offers an extensive, searchable and reusable corpus of XML/TEI-encoded texts and images. Part of the continuously growing corpus is the [Digital Library of TextGrid](https://textgrid.de/en/digitale-bibliothek), which consists of works by more than 600 authors of German fiction (prose, verse and drama) as well as nonfiction from the beginning of the printing press to the early 20th century. The files are stored in different output formats (XML, ePub, PDF), published and made searchable. Different tools, e.g. for image viewing or quantitative text analysis, can be used to visualise the texts or research them further.
You can search within the corpus by entering a search term; alternatively “Explore” will lead you to predefined categories (e.g. “author”, “genre”).
The search function supports the query language [Lucene](https://lucene.apache.org/core/); in addition to the free text search it allows queries with the following pattern:
[edition.agent.value:goethe AND pudel](/search?query=edition.agent.value%3Agoethe+AND+pudel)
More information on search syntax, search categories and filters is available in the [Help](/doc/syntax) section.
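The example link above carries the Lucene query in URL-encoded form in the `query` parameter. A minimal sketch of producing such a link, assuming Java's standard `URLEncoder`; the class name `SearchLink` is hypothetical and not part of the repository code:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper: builds a relative search link for a Lucene query string.
final class SearchLink {

  static String searchUrl(String query) {
    try {
      // form encoding turns ':' into %3A and spaces into '+'
      return "/search?query=" + URLEncoder.encode(query, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      // UTF-8 is guaranteed to be available on every Java platform
      throw new IllegalStateException(e);
    }
  }
}
```

For instance, `SearchLink.searchUrl("edition.agent.value:goethe AND pudel")` yields the link target shown above.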
# Citation recommendation
TextGrid Consortium. 2006–2014. TextGrid: A Virtual Research Environment for the Humanities. Göttingen: TextGrid Consortium. textgrid.de.
# Participation
Would you like your own XML-encoded files to be archived, made citable and accessible through the TextGrid Repository? Then contact us: <https://textgrid.de/en/kontakt/>
# German help
the help texts go here
\ No newline at end of file
# Searching in the TextGrid Repository
## The syntax of Apache Lucene
The TextGrid Repository uses the syntax of Apache Lucene 2.9.4. The following paragraphs explain how this syntax works. The explanation largely follows the summary on the [website of Apache Lucene](https://lucene.apache.org/core/5_1_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description).
In this syntax, search queries are built from search terms and so-called operators. Search terms can be narrowed down using fields.
### Search terms and search fields
You can search for single words as well as for phrases, which are marked as belonging together by enclosing them in quotation marks, e.g. “TextGrid Repository”. Search terms can be restricted to certain fields:
field-name:search-term
or
field-name:"multipart phrase"
TextGrid provides the following fields:
* “title” for the title of the work
* “edition.agent.value” for the author
* “language” for the language of the work
* “notes” for notes on the text
* “genre” for the genre
* “rightsHolder” for the rights holder of the digital version of the text
* “work.dateOfCreation.date”, “work.dateOfCreation.notBefore” and “work.dateOfCreation.notAfter” for dates of the work
The “Advanced Search” lets you choose the metadata fields to search in directly and connect them with operators to form search queries.
Search queries can be modified in several ways: with placeholders, fuzzy search, distances between words, searches within a defined range, and different relevance weights for search terms.
* **Placeholders:** Within single words, `?` replaces exactly one character and `*` stands for any number of characters, e.g. `Text?rid` or `*xtgrid`.
* **Fuzzy search:** Appending a `~` to a word makes the search fuzzy, based on the Levenshtein distance. The `~` can be followed by a value between 0 and 1; the closer the value is to 1, the higher the required similarity. The default value is 0.5.
* **Distances:** When searching for phrases, appending a `~` and a number to the phrase specifies the distance allowed between the words of the phrase, e.g. `"TextGrid Repository"~10`. The number states how many words may lie between the words of the phrase. The “Advanced Search” lets you enter the number directly in the search form.
* **Ranges:** Connecting two search values with `TO` finds all values between them within the field. This applies to numerical values as well as words; for words, alphabetical order is used. Ranges that include the given values are written within `[]`, ranges that exclude them within `{}`. E.g. `edition.agent.value:[Aristophanes TO Zuckmayer]` searches for all author names between “Aristophanes” and “Zuckmayer”, including those two names.
* **Relevance:** Appending a `^` and a number to a search term or phrase marks it as more relevant, e.g. `TextGrid^5 Repository`. The default value is 1.
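For applications that assemble such queries programmatically, the modifiers listed above can be sketched as plain string helpers. This is an illustrative sketch only; the class `LuceneModifiers` and its method names are hypothetical and not part of TextGrid or Lucene:

```java
import java.util.Locale;

// Hypothetical helpers that only build query strings; nothing is sent to a server.
final class LuceneModifiers {

  // field-name:search-term
  static String field(String name, String term) {
    return name + ":" + term;
  }

  // "multipart phrase"
  static String phrase(String words) {
    return "\"" + words + "\"";
  }

  // term~0.8 : similarity search with a value between 0 and 1
  static String fuzzy(String term, double similarity) {
    return term + "~" + String.format(Locale.ROOT, "%.1f", similarity);
  }

  // "some phrase"~10 : distance allowed between the words of the phrase
  static String proximity(String words, int distance) {
    return phrase(words) + "~" + distance;
  }

  // [a TO b] includes the bounds, {a TO b} excludes them
  static String range(String from, String to, boolean inclusive) {
    return (inclusive ? "[" : "{") + from + " TO " + to + (inclusive ? "]" : "}");
  }

  // term^5 : higher relevance for this term or phrase
  static String boost(String termOrPhrase, int factor) {
    return termOrPhrase + "^" + factor;
  }
}
```

For example, `LuceneModifiers.field("edition.agent.value", LuceneModifiers.range("Aristophanes", "Zuckmayer", true))` produces the range query from the list above.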
Some characters must be escaped with a preceding `\`: `+ - && || ! ( ) { } [ ] ^ " ~ * ? : \`.
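The rule above can be sketched as a small helper that puts a backslash in front of every listed character. The class `LuceneEscape` is hypothetical, and escaping each `&` and `|` individually is an assumption of this sketch; where Lucene itself is on the classpath, its own `QueryParser.escape` method is probably the better choice:

```java
// Hypothetical helper: escapes characters that have a special meaning in Lucene queries.
final class LuceneEscape {

  // the characters listed above; '&' and '|' are escaped one by one
  private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

  static String escape(String term) {
    StringBuilder sb = new StringBuilder(term.length() + 8);
    for (int i = 0; i < term.length(); i++) {
      char c = term.charAt(i);
      if (SPECIAL.indexOf(c) >= 0) {
        sb.append('\\'); // prefix the special character with a backslash
      }
      sb.append(c);
    }
    return sb.toString();
  }
}
```

The helper is meant for user-supplied search terms only; applying it to a complete query would also escape the operators and field separators.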
### Operators
Lucene uses logical connectives to combine search terms and phrases. The default connective is OR, which is equivalent to `||`. Logical connectives must be written in capital letters.
* **AND (equivalent to &&):** Texts containing all of the search terms are found.
* **+:** The following search term must be contained in the text.
* **NOT (equivalent to ! or -):** The following search term must not be contained in the text. Using this at the beginning of a search query can slow down the search.
Lucene supports bracketing for the combination of logical connectives, e.g. `TextGrid AND (Laboratory OR Repository)` finds all texts that contain the word “TextGrid”, as well as the word “Laboratory” or “Repository”. This mechanism can be used with fields as well.
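The bracketing described above can be sketched with a few string helpers that always wrap a combination in parentheses, so that nested connectives keep their intended grouping. The class `LuceneBool` and its methods are hypothetical names chosen for this illustration:

```java
// Hypothetical helpers: combine query parts with logical connectives.
final class LuceneBool {

  // (a AND b AND ...)
  static String and(String... parts) {
    return "(" + String.join(" AND ", parts) + ")";
  }

  // (a OR b OR ...)
  static String or(String... parts) {
    return "(" + String.join(" OR ", parts) + ")";
  }

  // NOT a
  static String not(String part) {
    return "NOT " + part;
  }
}
```

`"TextGrid AND " + LuceneBool.or("Laboratory", "Repository")` reproduces the example above, and `"title:" + LuceneBool.or("Faust", "Urfaust")` applies the same grouping to a field.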
......@@ -117,4 +117,5 @@ search-term=Suchbegriff
settings=Einstellungen
shelf=Regal
imprint=Impressum
privpol=Datenschutzerklärung
\ No newline at end of file
privpol=Datenschutzerklärung
help=Hilfe
\ No newline at end of file
......@@ -118,4 +118,4 @@ settings=Settings
shelf=Shelf
imprint=Imprint
privpol=Privacy Policy
help=Help
......@@ -53,11 +53,11 @@
<li class="tg nav_item -has-dropdown" id="nav-explore" role="presentation">
<a aria-labelledby="nav-explore" href="/browse" class="tg dropdown_toggle -nav" aria-haspopup="true" role="menuitem">
Hilfe
${i18n['help']}
</a>
<ul class="tg dropdown_menu -nav" role="menu">
<li class="" id="layout_18" role="presentation">
<a aria-labelledby="layout_18" href="/repository" role="menuitem" tabindex="">Syntax</a>
<a aria-labelledby="layout_18" href="/doc/syntax" role="menuitem" tabindex="">${i18n['search']}</a>
</li>
</ul>
</li>
......
<%@ page contentType="text/html" pageEncoding="UTF-8" %>
<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %>
<%@ taglib uri="http://textgrid.info/rep/utils" prefix="utils" %>
<%@ taglib uri="http://java.sun.com/jsp/jstl/functions" prefix="fn" %>
<%@ include file="base/head.jsp" %>
<div class="journal-content-article">
${content}
</div>
<%@ include file="base/foot.jsp" %>