Arconia Docling
Arconia integrates with Docling, an AI-powered document conversion service that transforms documents into structured formats such as Markdown. The integration provides an auto-configured DoclingClient that Spring Boot applications can use to interact with a Docling server and convert various document formats, including PDFs, Word documents, and web pages.
Quick Start
Let’s see how you can get started with Arconia Docling in your Spring Boot application.
Dependencies
To add Docling support to your Spring Boot application, include the Arconia Docling Spring Boot Starter dependency in your project.
Gradle
dependencies {
    implementation 'io.arconia:arconia-docling-spring-boot-starter'
}
Maven
<dependency>
    <groupId>io.arconia</groupId>
    <artifactId>arconia-docling-spring-boot-starter</artifactId>
</dependency>
Arconia publishes a BOM (Bill of Materials) that you can use to manage the version of the Arconia libraries. While not required, it is recommended to use the BOM to ensure that all dependencies are compatible.
Gradle
dependencyManagement {
    imports {
        mavenBom "io.arconia:arconia-bom:0.15.0"
    }
}
Maven
<dependencyManagement>
    <dependencies>
        <dependency>
            <groupId>io.arconia</groupId>
            <artifactId>arconia-bom</artifactId>
            <version>0.15.0</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>
Dev Services
Arconia Dev Services provide zero-code integrations for services your application depends on, both at development and test time, relying on the power of Testcontainers and Spring Boot.
When working with Docling, you can use the Docling Dev Service to automatically start a Docling server during development and testing, so you can convert documents without setting up a Docling server manually.
To enable the Docling Dev Service, add the following dependency to your project:
Gradle
dependencies {
    testAndDevelopmentOnly "io.arconia:arconia-dev-services-docling"
}
Maven
<dependency>
    <groupId>io.arconia</groupId>
    <artifactId>arconia-dev-services-docling</artifactId>
    <scope>runtime</scope>
    <optional>true</optional>
</dependency>
By default, the Dev Service is configured to expose the Docling UI on a specific port. The application logs show the URL where you can access it:
... Docling UI: http://localhost:<port>/ui
Running the Application
When using the Arconia Dev Services, you can keep running your application as you normally would. The Dev Services will automatically start when you run your application.
CLI
arconia dev
Gradle
./gradlew bootRun
Maven
./mvnw spring-boot:run
Unlike the lower-level Testcontainers support in Spring Boot, Arconia doesn’t require special tasks to run your application when using Dev Services (./gradlew bootTestRun or ./mvnw spring-boot:test-run), nor does it require you to define a separate @SpringBootApplication class for configuring Testcontainers.
The application logs will show you the URL where you can access the Docling Server UI for interactive document conversion.
Configuration
The Arconia Docling integration provides sensible defaults for connecting to a Docling server. You can customize the connection settings and timeouts via configuration properties.
Property | Default | Description |
---|---|---|
 | | The base URL for the Docling server API. |
 | | Timeout to establish a connection to the Docling server. |
 | | Timeout for receiving a response from the Docling server. |
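To make the difference between the two timeouts concrete, here is a minimal sketch using the JDK's own java.net.http client instead of the auto-configured DoclingClient. The class name, the URL, and the durations are placeholders chosen for illustration; they are not Arconia property names or defaults.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.time.Duration;

final class TimeoutSketch {

    // Connection timeout: upper bound for establishing the connection to the server.
    static HttpClient clientWithConnectTimeout(Duration connectTimeout) {
        return HttpClient.newBuilder()
                .connectTimeout(connectTimeout)
                .build();
    }

    // Request timeout: upper bound for waiting on the server's response.
    static HttpRequest requestWithReadTimeout(String baseUrl, Duration readTimeout) {
        return HttpRequest.newBuilder(URI.create(baseUrl))
                .timeout(readTimeout)
                .GET()
                .build();
    }

    public static void main(String[] args) {
        // Placeholder values for illustration only.
        HttpClient client = clientWithConnectTimeout(Duration.ofSeconds(5));
        HttpRequest request = requestWithReadTimeout("http://localhost:5001", Duration.ofSeconds(30));
        System.out.println(client.connectTimeout() + " / " + request.timeout());
    }
}
```

Document conversion can take a while for large inputs, so the response timeout is typically the one worth raising, while the connection timeout can stay short to fail fast when the server is down.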
Actuator
Health Indicator
When Spring Boot Actuator is present on the classpath, Arconia automatically configures a health indicator for the Docling integration. This health indicator checks the connectivity to the Docling server by calling its health endpoint. You can customize it via configuration properties.
Property | Default | Description |
---|---|---|
 | | Whether the Docling health indicator should be enabled. |
When enabled, the health status is included in the Actuator /health endpoint response, showing whether the Docling server is reachable and operational.
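The idea behind such a connectivity check can be sketched with plain JDK classes. This is not Arconia's health indicator implementation: the class name, the "/health" path on the server, the two-second timeouts, and the rule "any 2xx means reachable" are all assumptions made for this example.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

final class HealthProbeSketch {

    // Returns true when the server answers the probe with a 2xx status.
    // The "/health" path is an assumption made for this sketch.
    static boolean isReachable(String baseUrl) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(2))
                .build();
        HttpRequest request = HttpRequest.newBuilder(URI.create(baseUrl + "/health"))
                .timeout(Duration.ofSeconds(2))
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() / 100 == 2;
        } catch (IOException ex) {
            // Connection refused, timeout, or another I/O failure.
            return false;
        } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder URL; prints false when nothing is listening there.
        System.out.println(isReachable("http://localhost:5001"));
    }
}
```

A real health indicator would typically also report details such as the failure cause, which this sketch omits.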
Using the Docling Client
Once you have added the dependency and, optionally, configured the connection settings, you can autowire and use the auto-configured DoclingClient in your Spring components.
Basic Usage
@Component
public class DocumentService {

    private final DoclingClient doclingClient;

    // The auto-configured DoclingClient is injected via the constructor.
    public DocumentService(DoclingClient doclingClient) {
        this.doclingClient = doclingClient;
    }

    // Converts the page at the given URL and returns its Markdown content.
    public String convertWebPage(String url) {
        ConvertDocumentRequest request = ConvertDocumentRequest.builder()
                .addHttpSources(url)
                .build();
        ConvertDocumentResponse response = doclingClient.convertSource(request);
        return response.document().markdownContent();
    }
}
Converting HTTP Sources
You can convert web pages or documents accessible via HTTP/HTTPS URLs:
ConvertDocumentRequest request = ConvertDocumentRequest.builder()
        .addHttpSources("https://example.com/document.pdf")
        .build();
ConvertDocumentResponse response = doclingClient.convertSource(request);
String markdownContent = response.document().markdownContent();
String filename = response.document().filename();
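Since only HTTP and HTTPS sources apply here, it can be useful to reject other inputs before building a request. The following sketch is a hypothetical pre-check written with JDK classes only; the class and method names are illustrative and not part of the Arconia or Docling APIs.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class SourceUrlCheck {

    // Returns true only for syntactically valid http/https URLs that include a host.
    static boolean isHttpSource(String candidate) {
        try {
            URI uri = new URI(candidate);
            String scheme = uri.getScheme();
            return uri.getHost() != null
                    && scheme != null
                    && (scheme.equalsIgnoreCase("http") || scheme.equalsIgnoreCase("https"));
        } catch (URISyntaxException ex) {
            // Not parseable as a URI at all (for example, it contains spaces).
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isHttpSource("https://example.com/document.pdf"));
        System.out.println(isHttpSource("file:///tmp/document.pdf"));
    }
}
```

Inputs that fail this check, such as local paths, are candidates for the file-based conversion instead.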
Converting File Sources
You can also convert local files by encoding them as Base64:
byte[] fileContent = new ClassPathResource("document.pdf").getContentAsByteArray();
String base64Content = Base64.getEncoder().encodeToString(fileContent);
ConvertDocumentRequest request = ConvertDocumentRequest.builder()
        .addFileSources("document.pdf", base64Content)
        .build();
ConvertDocumentResponse response = doclingClient.convertSource(request);
String markdownContent = response.document().markdownContent();
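When the file lives on the file system rather than on the classpath, the encoding step can be done with JDK classes alone. This is a minimal sketch; the FileSourceEncoding class and its encodeFile method are hypothetical helpers, not part of Arconia.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

final class FileSourceEncoding {

    // Reads the whole file into memory and encodes it with the standard Base64 alphabet.
    static String encodeFile(Path file) throws IOException {
        byte[] content = Files.readAllBytes(file);
        return Base64.getEncoder().encodeToString(content);
    }

    public static void main(String[] args) throws IOException {
        // Demonstrates the helper on a small temporary file.
        Path tmp = Files.createTempFile("docling-sketch", ".txt");
        Files.write(tmp, "hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(encodeFile(tmp));
        Files.delete(tmp);
    }
}
```

The resulting string plays the role of base64Content in the request above. Because the file is read fully into memory, very large documents may call for a different approach.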
Conversion Options
You can customize the conversion process using ConvertDocumentOptions:
ConvertDocumentOptions options = ConvertDocumentOptions.builder()
        .includeImages(true)
        .doOcr(true)
        .build();
ConvertDocumentRequest request = ConvertDocumentRequest.builder()
        .addHttpSources("https://example.com/document.pdf")
        .options(options)
        .build();
ConvertDocumentResponse response = doclingClient.convertSource(request);
Error Handling
The DoclingClient throws runtime exceptions for the different error conditions, as raised by the underlying RestClient.
try {
    ConvertDocumentRequest request = ConvertDocumentRequest.builder()
            .addHttpSources("https://invalid-url.com/document.pdf")
            .build();
    ConvertDocumentResponse response = doclingClient.convertSource(request);
} catch (HttpClientErrorException.NotFound ex) {
    // Most specific case first: NotFound is a subclass of HttpClientErrorException.
    log.warn("Document not found: {}", ex.getMessage());
} catch (HttpClientErrorException ex) {
    // Any other 4xx response.
    log.error("Client error during conversion: {}", ex.getMessage());
} catch (HttpServerErrorException ex) {
    // Any 5xx response.
    log.error("Server error during conversion: {}", ex.getMessage());
}
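The order of the catch clauses follows the way Spring's RestClient maps response statuses by default: 4xx responses raise HttpClientErrorException (with NotFound as the 404 subclass) and 5xx responses raise HttpServerErrorException. The sketch below mirrors that mapping with a plain enum so the branching is easy to see. StatusClassifier and Outcome are hypothetical names used for illustration only.

```java
final class StatusClassifier {

    enum Outcome { NOT_FOUND, CLIENT_ERROR, SERVER_ERROR, NO_ERROR }

    // Mirrors the catch order above: the most specific case (404) is checked first.
    static Outcome classify(int statusCode) {
        if (statusCode == 404) {
            return Outcome.NOT_FOUND;
        }
        if (statusCode >= 400 && statusCode < 500) {
            return Outcome.CLIENT_ERROR;
        }
        if (statusCode >= 500 && statusCode < 600) {
            return Outcome.SERVER_ERROR;
        }
        return Outcome.NO_ERROR;
    }

    public static void main(String[] args) {
        System.out.println(classify(404));
        System.out.println(classify(422));
        System.out.println(classify(503));
    }
}
```

Connection failures and timeouts never produce a status code, so they fall outside this mapping and would need their own handling.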