Suggestions for optimization

Configuration
- added `confluenceCreateSubpages` parameter to be able to switch off creation of subpages
- changed `input` parameter into a list of files
- one can now also configure a parent page by file

asciidoc2confluence
- added `scriptBasePath` variable which now MUST be set (couldn't find a better way to get the path to the Config.groovy file. Suggestions welcome!)
- adopted new config parameters
- added more log output
- added check whether 'ancestorId' is provided, if not a new master page will be created

README
- switched from Markdown to Asciidoc
- added section about usage with Maven and GMavenPlus

rdmueller · May 5, 2015 · b9114ac
1 parent 659f307
commit b9114ac
Showing 4 changed files with 180 additions and 55 deletions.
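
To make the new `input` format from the commit message more concrete, here is a minimal sketch that parses an inline copy of the configuration with `ConfigSlurper` and decides, per file, whether a new parent page would be needed. The `describe` closure and the inline `configText` are made up for this illustration; only the `file` and `ancestorId` keys and the "empty `ancestorId` means create a new parent" convention come from the commit itself.

```groovy
// Inline stand-in for Config.groovy (assumption: the real script reads the file from disk).
def configText = '''
input = [
    [ file: "asciidocOutput1.html", ancestorId: "" ],
    [ file: "asciidocOutput2.html", ancestorId: 123456 ]
]
confluenceCreateSubpages = false
'''

def config = new ConfigSlurper().parse(configText)

// Hypothetical helper: relies on Groovy truth, where an empty string counts as "not set".
def describe = { entry ->
    if (entry.ancestorId) {
        return "${entry.file}: attach below page ${entry.ancestorId}".toString()
    }
    return "${entry.file}: create a new parent page".toString()
}

def plan = config.input.collect(describe)
plan.each { println it }
```

Because the check uses Groovy truth, both `''` and a missing `ancestorId` key would be treated as "no parent configured" in this sketch.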
diff --git a/Config.groovy b/Config.groovy
@@ -1,21 +1,32 @@
 
-// the asciidoc generated html file to be exported
-    input = "asciidocOutput.html"
+// 'input' is an array of files to upload to Confluence with the ability 
+//          to configure a different parent page for each file.
+//
+// Attributes
+// - 'file': absolute or relative path to the asciidoc generated html file to be exported
+// - 'ancestorId': the id of the parent page in Confluence; leave this empty if a new parent shall be created in the space
+input = [
+    	[ file: "asciidocOutput1.html", ancestorId: '' ],
+    	[ file: "asciidocOutput2.html", ancestorId: 123456 ]
+	]
 
 // endpoint of the confluenceAPI (REST) to be used
-    confluenceAPI = 'https://[yourServer]/[context]/rest/api/'
+confluenceAPI = 'https://[yourServer]/[context]/rest/api/'
 
 // the key of the confluence space to write to
-    confluenceSpaceKey = 'asciidoc'
+confluenceSpaceKey = 'asciidoc'
+
+// variable to determine whether ".sect2" sections shall be split from the current page into subpages
+confluenceCreateSubpages = false
 
 // the pagePrefix will be a prefix for each page title
 // use this if you only have access to one confluence space but need to store several
 // pages with the same title - a different pagePrefix will make them unique
-    confluencePagePrefix = ''
+confluencePagePrefix = ''
 
 // username:password of an account which has the right permissions to create and edit
 // confluence pages in the given space.
 // if you want to store it securely, fetch it from some external storage.
 // you might even want to prompt the user for the password
-    confluenceCredentials = 'user:password'.bytes.encodeBase64().toString()
+confluenceCredentials = 'user:password'.bytes.encodeBase64().toString()
 
diff --git a/README.adoc b/README.adoc
@@ -0,0 +1,97 @@
+= asciidoc2confluence
+:toc:
+:includedir: ./
+:source-highlighter: coderay
+
+This is a Groovy script that imports HTML files generated by http://asciidoctor.org/[Asciidoctor] into one or more https://www.atlassian.com/software/confluence/[Confluence] pages.
+
+The easiest way to get started is to adapt `Config.groovy` to your environment and load the main script into the groovyConsole. You also need some HTML output from http://asciidoctor.org/[Asciidoctor] (the https://raw.githubusercontent.com/arc42/arc42.github.io/master/template/en/arc42-template.html[arc42 template] might be a good starting point). Please note that the script only targets Asciidoctor output, because it makes assumptions about the HTML structure (e.g. that the "sect1" and "sect2" CSS classes are present).
+
+When you start the script for the first time, it tries to split the HTML file into subsections and push them to your Confluence instance. This makes it possible to handle large documentation pages and to send update notifications for specific sections of a large document. The split can be switched off with the `confluenceCreateSubpages` configuration parameter.
+
+
+== Configuration
+
+[source,groovy]
+.Config.groovy
+----
+include::Config.groovy[]
+----
+
+== Running the script
+
+The script can be run directly or via Maven or Gradle. It requires Java >= 7, and the `scriptBasePath` variable must be set to the folder that contains the `Config.groovy` file.
+
+=== Usage with Maven
+
+The following `pom.xml` sample shows how to use the `asciidoc2confluence.groovy` script with your Maven build. The script runs when you execute `mvn gplus:execute`.
+
+[source,xml,linenums]
+----
+<?xml version="1.0" encoding="UTF-8"?>
+<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
+  <modelVersion>4.0.0</modelVersion>
+  <groupId>io.github.rdmueller</groupId>
+  <artifactId>asciidoc2confluence</artifactId>
+  <version>1.0.0-SNAPSHOT</version>
+  <packaging>jar</packaging>
+  <name>asciidoc2confluence sample</name>
+  <description>An asciidoc2confluence sample pom.xml</description>
+  <properties>
+    <!-- The following class will be used in the MANIFEST.MF -->
+    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
+    <java.version>1.8</java.version>
+    <java.target.version>1.8</java.target.version>
+    <java.source.version>1.8</java.source.version>
+  </properties>
+  <build>
+    <finalName>${project.artifactId}</finalName>
+    <plugins>
+      <plugin>
+        <groupId>org.apache.maven.plugins</groupId>
+        <artifactId>maven-compiler-plugin</artifactId>
+        <configuration>
+          <target>${java.target.version}</target>
+          <source>${java.source.version}</source>
+        </configuration>
+      </plugin>
+      <plugin>
+        <groupId>org.codehaus.gmavenplus</groupId>
+        <artifactId>gmavenplus-plugin</artifactId>
+        <version>1.5</version>
+        <executions>
+          <execution>
+            <goals>
+              <goal>execute</goal>
+            </goals>
+          </execution>
+        </executions>
+        <configuration>
+          <properties>
+            <property>
+              <name>scriptBasePath</name>
+              <value>${project.basedir}/src/main/groovy/</value>
+            </property>
+          </properties>
+          <scripts>
+            <script>file:///${project.basedir}/src/main/groovy/asciidoc2confluence.groovy</script>
+          </scripts>
+        </configuration>
+        <dependencies>
+          <dependency>
+            <groupId>org.codehaus.groovy</groupId>
+            <artifactId>groovy-all</artifactId>
+            <!-- any version of Groovy \>= 1.5.0 should work here -->
+            <version>2.4.3</version>
+            <scope>runtime</scope>
+          </dependency>
+        </dependencies>
+      </plugin>
+    </plugins>
+  </build>
+</project>
+----
+
+=== Usage with Gradle
+
+_to be done..._
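
The README says the script needs `scriptBasePath` to be set but does not show how to do that outside Maven. One possible way, sketched below, is to supply it through a `Binding` when evaluating the script with `GroovyShell`. The temporary folder, the one-line `Config.groovy` and the short `scriptText` are stand-ins invented for this example; only the `new File(scriptBasePath, 'Config.groovy')` lookup is taken from the commit.

```groovy
import java.nio.file.Files

// Temporary folder standing in for the directory that holds Config.groovy.
def baseDir = Files.createTempDirectory('a2c-demo').toFile()
new File(baseDir, 'Config.groovy').text = "confluenceSpaceKey = 'asciidoc'\n"

// Stand-in for asciidoc2confluence.groovy: only the config lookup from the commit.
def scriptText = '''
def config = new ConfigSlurper().parse(new File(scriptBasePath, 'Config.groovy').text)
return config.confluenceSpaceKey
'''

// A variable placed in the binding is visible inside the script as scriptBasePath.
def binding = new Binding([scriptBasePath: baseDir.absolutePath])
def spaceKey = new GroovyShell(binding).evaluate(scriptText)
println "space key read via scriptBasePath: ${spaceKey}"
```

To run the real script this way, `scriptText` would be replaced by the contents of `asciidoc2confluence.groovy`, assuming its dependencies are on the classpath.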
diff --git a/README.md b/README.md
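
The change detection inside `pushToConfluence` in `asciidoc2confluence.groovy` can be tried out without a Confluence server. The sketch below mirrors the steps visible in the diff: definition lists are rewritten as tables, the result is hashed, the hash is appended in a hidden paragraph, and a regular expression reads it back for comparison. The closure names `md5`, `normalize` and `extractHash` and the sample HTML are assumptions made for this example, and UTF-8 is chosen explicitly for hashing, which the original helper does not do.

```groovy
import java.security.MessageDigest

// Same idea as the script's MD5 helper; the explicit UTF-8 charset is an assumption.
def md5 = { String s ->
    MessageDigest.getInstance('MD5').digest(s.getBytes('UTF-8')).encodeHex().toString()
}

// Definition lists become tables, mirroring the replaceAll chain in the diff.
def normalize = { String html ->
    html.replaceAll('<dl>', '<table>')
        .replaceAll('</dl>', '</table>')
        .replaceAll('<dt[^>]*>', '<tr><th>')
        .replaceAll('</dt>', '</th>')
        .replaceAll('<dd>', '<td>')
        .replaceAll('</dd>', '</td></tr>')
}

// Read the embedded hash back out of a page body; empty string if none is present.
def extractHash = { String page ->
    def matcher = page =~ /(?ms)hash: #([^#]+)#/
    matcher.size() == 0 ? '' : matcher[0][1]
}

def localPage = normalize('<dl><dt class="hdlist1">Key</dt><dd>Value</dd></dl>')
def localHash = md5(localPage)
// What would be stored remotely: the page body plus the hidden hash paragraph.
def storedPage = localPage + '<p style="display:none">hash: #' + localHash + '#</p>'

def unchanged = extractHash(storedPage) == localHash
def editedHash = md5(normalize('<dl><dt>Key</dt><dd>Other</dd></dl>'))
def changed = extractHash(storedPage) != editedHash
println "unchanged: ${unchanged}, changed after edit: ${changed}"
```

Hashing the normalized local page rather than comparing against the stored body avoids false positives when Confluence rewrites the markup it stores.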
diff --git a/asciidoc2confluence.groovy b/asciidoc2confluence.groovy
@@ -35,14 +35,16 @@ import groovyx.net.http.ContentType
 import java.security.MessageDigest
 
 // configuration
-def config = new ConfigSlurper().parse(new File('Config.groovy').text)
-println config
+println "scriptBasePath: ${scriptBasePath}"
+def config = new ConfigSlurper().parse(new File(scriptBasePath, 'Config.groovy').text)
+println "Config: ${config}"
+
 // helper functions
 
 def MD5(String s) {
     MessageDigest.getInstance("MD5").digest( s.bytes).encodeHex().toString()
 } 
- 
+
 // for getting better error message from the REST-API
 void trythis (Closure action) {
     try {
@@ -58,26 +60,35 @@ void trythis (Closure action) {
 def pushToConfluence = { pageTitle, pageBody, parentId ->
     def api = new RESTClient(config.confluenceAPI)
     def headers = [
-                    'Authorization': 'Basic ' + config.confluenceCredentials,
-                    'Content-Type':'application/json; charset=utf-8'
-                  ]
+            'Authorization': 'Basic ' + config.confluenceCredentials,
+            'Content-Type':'application/json; charset=utf-8'
+    ]
     //this fixes the encoding
-    api.encoderRegistry = new EncoderRegistry( charset: 'utf-8' )                   
+    api.encoderRegistry = new EncoderRegistry( charset: 'utf-8' )
     //try to get an existing page
     def page
     localPage = pageBody.toString().trim()
     //modify local page in order to match the internal confluence storage representation a bit better
     //definition lists are not displayed by confluence, so turn them into tables
     localPage = localPage.replaceAll('<dl>','<table>')
+<<<<<<< HEAD
                          .replaceAll('</dl>','</table>')
                          .replaceAll('<dt[^>]*>','<tr><th>')
                          .replaceAll('</dt>','</th>')
                          .replaceAll('<dd>','<td>')
                          .replaceAll('</dd>','</td></tr>')
     def localHash = MD5(localPage)                     
+=======
+            .replaceAll('</dl>','</table>')
+            .replaceAll('<dt[^>]*>','<tr><th>')
+            .replaceAll('</dt>','</th>')
+            .replaceAll('<dd>','<td>')
+            .replaceAll('</dd>','</td></tr>')
+    def localHash = generateMD5(localPage)
+>>>>>>> Suggestions for optimization
     localPage += '<p><ac:structured-macro ac:name="children"/></p>'
     localPage += '<p style="display:none">hash: #'+localHash+'#</p>'
-                         
+
     def request = [
             type : 'page',
             title: config.confluencePagePrefix + pageTitle,
@@ -97,12 +108,12 @@ def pushToConfluence = { pageTitle, pageBody, parentId ->
         ]
     }
     trythis {
-        page = api.get(path: 'content', 
-                       query: [
-                            'spaceKey': config.confluenceSpaceKey,
-                            'title'   : config.confluencePagePrefix + pageTitle,
-                            'expand'  : 'body.storage,version'
-                       ], headers: headers).data.results[0]
+        page = api.get(path: 'content',
+                query: [
+                        'spaceKey': config.confluenceSpaceKey,
+                        'title'   : config.confluencePagePrefix + pageTitle,
+                        'expand'  : 'body.storage,version'
+                ], headers: headers).data.results[0]
     }
     if (page) {
         println "found existing page: " + page.id +" version "+page.version.number
@@ -113,7 +124,7 @@ def pushToConfluence = { pageTitle, pageBody, parentId ->
 
         def remoteHash = remotePage =~ /(?ms)hash: #([^#]+)#/
         remoteHash = remoteHash.size()==0?"":remoteHash[0][1]
-                
+
         if (remoteHash == localHash) {
             println "page hasn't changed!"
         } else {
@@ -123,8 +134,8 @@ def pushToConfluence = { pageTitle, pageBody, parentId ->
                 request.id      = page.id
                 request.version = [number: (page.version.number as Integer) + 1]
                 def res = api.put(contentType: ContentType.JSON,
-                                  requestContentType : ContentType.JSON,
-                                  path: 'content/' + page.id, body: request, headers: headers)
+                        requestContentType : ContentType.JSON,
+                        path: 'content/' + page.id, body: request, headers: headers)
             }
             println "updated page"
             return page.id
@@ -133,38 +144,51 @@ def pushToConfluence = { pageTitle, pageBody, parentId ->
         //create a page
         trythis {
             page = api.post(contentType: ContentType.JSON,
-                            requestContentType : ContentType.JSON,
-                            path: 'content', body: request, headers: headers)
+                    requestContentType : ContentType.JSON,
+                    path: 'content', body: request, headers: headers)
         }
         println "created page "+page?.data?.id
         return page?.data?.id
     }
 }
-def html = new File(config.input).getText('utf-8')
-def dom = Jsoup.parse(html,'utf-8')
-// <div class="sect1"> are the main headings
-// let's extract these and push them to confluence
-def masterid = pushToConfluence "Main Page", "this shall be the main page under which all other pages are created", null
 
-dom.select('div.sect1').each { sect1 ->
-    def pageTitle = sect1.select('h2').text()
-    def pageBody = sect1.select('div.sectionbody')
-    def subPages = []
-    pageBody.select('div.sect2').each { sect2 ->
-        def title = sect2.select('h3').text()
-        sect2.select('h3').remove()
-        def body = sect2
-        subPages << [
-                title: title,
-                body: body
-                ]
+config.input.each { input ->
+
+    def html = new File(input.file).getText('utf-8')
+    def dom = Jsoup.parse(html,'utf-8')
+    def masterid = input.ancestorId
+
+    // if no ancestorId is configured for this file, create a new parent page
+    if (!input.ancestorId) {
+        masterid = pushToConfluence "Main Page", "this shall be the main page under which all other pages are created", null
+        log.info("New master page created with id ${masterid}")
     }
-    pageBody.select('div.sect2').remove()
-    println pageTitle
-    def thisSection = pushToConfluence pageTitle, pageBody, masterid
-    subPages.each { subPage ->
-        println "   "+subPage.title
-        pushToConfluence subPage.title, subPage.body, thisSection
+
+    // <div class="sect1"> are the main headings
+    // let's extract these and push them to confluence
+    dom.select('div.sect1').each { sect1 ->
+        def pageTitle = sect1.select('h2').text()
+        def pageBody = sect1.select('div.sectionbody')
+        def subPages = []
+
+        if (config.confluenceCreateSubpages) {
+            pageBody.select('div.sect2').each { sect2 ->
+                def title = sect2.select('h3').text()
+                sect2.select('h3').remove()
+                def body = sect2
+                subPages << [
+                        title: title,
+                        body: body
+                ]
+            }
+            pageBody.select('div.sect2').remove()
+        }
+        println pageTitle
+        def thisSection = pushToConfluence pageTitle, pageBody, masterid
+        subPages.each { subPage ->
+            println "   "+subPage.title
+            pushToConfluence subPage.title, subPage.body, thisSection
+        }
     }
 }
 ""