10 changes: 10 additions & 0 deletions .dockerignore
@@ -0,0 +1,10 @@
__pycache__
*.pyc
.env
venv
.venv
.vscode
.idea
.git
.github

41 changes: 41 additions & 0 deletions .github/workflows/test.yaml
@@ -0,0 +1,41 @@
name: tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest

    steps:
      - name: Check out repository
        uses: actions/checkout@v4

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: 3.12

      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
          pip install pytest
          pip install coverage
          pip install .

      - name: Run tests
        run: coverage run -m pytest

      - name: Make coverage report
        run: coverage lcov

      - name: Comment coverage report on PR
        if: ${{ github.event_name == 'pull_request' }}
        uses: romeovs/lcov-reporter-action@v0.3.1
        with:
          lcov-file: coverage.lcov
          delete-old-comments: true
161 changes: 107 additions & 54 deletions .gptcontext
@@ -2,23 +2,13 @@ Additional context is provided below.

Preferences for python code:
- adhere to common style conventions, e.g. PEP8
- keep lines under 80 characters long
- you MUST keep lines under 80 characters long

Markdown2confluence publishes a folder of markdown files to Confluence, with a page structure that mirrors the file and folder structure of the markdown files, ignoring any non-markdown files.

Required behavior:
All pages managed by markdown2confluence contain $CONFLUENCE_PAGE_TITLE_SUFFIX, e.g. '(autogenerated)'. New pages are created with this suffix, and on subsequent runs any pages with the suffix (or label, TBD) are overwritten or deleted.
Depending on how Confluence labels work it might be best to use labels instead. If using labels, refuse to delete any pages that do not have the page title suffix.
Any markdown that contains full or relative links to local media files should be published as pages with attached media. Relative links in markdown to local media are resolved from the location of the markdown file. Full-path links in markdown are resolved from the $MARKDOWN_FOLDER


Currently I am working on:
- Publisher class in publish.py contains the old code for now, I am moving
functionality to the other classes.
- Change from directly using requests to using the confluence client from
atlassian
- Use labels instead of only relying on the suffix (previously called search
pattern)
- All pages managed by markdown2confluence contain a suffix, e.g. '(autogenerated)'. New pages are created with this suffix, and on subsequent runs any pages with the suffix (or label, TBD) are overwritten or deleted. Depending on how Confluence labels work it might be best to use labels instead. If using labels, refuse to delete any pages that do not have the page title suffix.
- Any markdown that contains full or relative links to local media files should be published as pages with attached media. Relative links in markdown to local media are resolved from the location of the markdown file. Full-path links in markdown are resolved from the $MARKDOWN_FOLDER.
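
The link resolution rule above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `resolve_media_link` is a hypothetical helper, and a "full-path" link is assumed to mean one that starts with `/`.

```python
from pathlib import Path


def resolve_media_link(link: str, markdown_file: Path,
                       markdown_folder: Path) -> Path:
    """Map a local media link found in markdown to a path on disk.

    Hypothetical helper; assumes "full-path" means a leading "/".
    """
    if link.startswith("/"):
        # Full-path link: resolved from the markdown folder root.
        return (markdown_folder / link.lstrip("/")).resolve()
    # Relative link: resolved from the folder holding the markdown file.
    return (markdown_file.parent / link).resolve()
```

With `markdown_folder` set to `docs`, both `/img/a.png` and `../img/a.png` (linked from `docs/guide/page.md`) would point at `docs/img/a.png`.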


file structure:
@@ -29,12 +19,15 @@ markdown2confluence/
│ └── usage.md
├── LICENCE
├── markdown2confluence
│ ├── converter.py
│ ├── __init__.py
│ ├── main.py
│ ├── converter.py
│ ├── confluence.py
│ ├── config.py
│ ├── file_manager.py
│ ├── content_tree.py
│ ├── parser.py
│ ├── util.py
│ ├── version.py
│ └── publisher.py
├── README.md
├── requirements.txt
@@ -48,7 +41,7 @@ markdown2confluence/
│ └── test_integration.py
└── unit
├── __init__.py
├── test_file_manager.py
├── test_parser.py
├── test_confluence.py
└── test_publisher.py

@@ -79,63 +72,123 @@ CONFLUENCE_IGNOREFILE

#### Components and Their Key Interfaces

1. **ConfluenceClient**
1. **Publisher**

Responsible for direct interactions with the Confluence API, handling operations like page creation, updates, deletion, and labeling with retries and backoff for robustness.
Abstract Publisher class for publishing a content tree, respecting the ContentTree structure and managing page relationships.

```python
class ConfluenceClient:
    def __init__(self, confluence_config: dict):
        """Initialize with API configuration."""

    def create_or_update_page(self, title: str, html: str, parent_id=None, space_key: str, labels=None) -> dict:
        """Create or update a Confluence page, applying labels."""

    def delete_page(self, page_id: str) -> dict:
        """Delete a Confluence page by ID."""
class Publisher(ABC):
    @abstractmethod
    def publish_node(self, node: ContentNode, parent_id: str | None) -> str:
        pass

    def pre_publish_hook(self):
        """
        Optional step for actions to perform before publishing, such as
        fetching/deleting previously published resources.
        Can be overridden by subclasses.
        """
        pass

    def post_publish_hook(self):
        """
        Optional step for actions to perform after publishing, such as
        cleaning up resources or performing additional logging.
        Can be overridden by subclasses.
        """
        pass

    def publish_content(self, content_tree: ContentTree):
        """
        Traverse a content tree and call publish_node on each element.
        """
        pass
```
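
The traversal that `publish_content` describes could look roughly like the sketch below. It is not the project's implementation. It assumes that the hooks run once before and after the traversal, that the synthetic root node is not itself published, and that the id returned by `publish_node` is passed on as `parent_id` for that node's children. `RecordingPublisher` and the trimmed-down `ContentNode` and `ContentTree` are hypothetical stand-ins so the example runs without Confluence.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class ContentNode:
    # Simplified stand-in for the ContentNode interface in this document.
    name: str
    content: str | None = None
    children: dict[str, ContentNode] = field(default_factory=dict)


@dataclass
class ContentTree:
    root: ContentNode = field(default_factory=lambda: ContentNode("root"))

    def add_node(self, path_list: list[str],
                 content: str | None = None) -> ContentNode:
        # Walk down from the root, creating missing nodes on the way.
        node = self.root
        for name in path_list:
            node = node.children.setdefault(name, ContentNode(name))
        node.content = content
        return node


class Publisher(ABC):
    @abstractmethod
    def publish_node(self, node: ContentNode,
                     parent_id: str | None) -> str:
        """Publish one node and return its id."""

    def pre_publish_hook(self) -> None:
        pass

    def post_publish_hook(self) -> None:
        pass

    def publish_content(self, content_tree: ContentTree) -> None:
        # Assumption: hooks wrap the whole traversal.
        self.pre_publish_hook()
        # Assumption: the root is a container and is not published.
        self._publish_children(content_tree.root, None)
        self.post_publish_hook()

    def _publish_children(self, node: ContentNode,
                          parent_id: str | None) -> None:
        # Depth first: a parent is published before its children so
        # that its id is available as their parent_id.
        for child in node.children.values():
            child_id = self.publish_node(child, parent_id)
            self._publish_children(child, child_id)


class RecordingPublisher(Publisher):
    """Hypothetical publisher that records calls instead of using an API."""

    def __init__(self) -> None:
        self.calls = []

    def publish_node(self, node, parent_id):
        self.calls.append((node.name, parent_id))
        return f"id-{node.name}"


tree = ContentTree()
tree.add_node(["guide", "install.md"], content="# Install")
tree.add_node(["guide", "usage.md"], content="# Usage")
publisher = RecordingPublisher()
publisher.publish_content(tree)
```

Publishing parents first keeps `publish_node` free of any knowledge about the tree: each call only needs the node and the id of its already published parent.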

2. **Publisher**
2. **ConfluencePublisher**

Orchestrates the conversion of Markdown to HTML and the subsequent publishing to Confluence, respecting the original directory structure and managing page relationships.
Specialized publisher for Confluence. Implements the publish_node function, which is responsible for creating and updating pages, including their labels, in Confluence.

```python
class Publisher:
    def __init__(self, confluence_client: ConfluenceClient, source_directory: str, space_key: str):
        """Setup with Confluence client, source directory, and target space key."""

    def publish(self):
        """Main method to start the publishing process."""

    def traverse_directory(self, directory: str, parent_id=None):
        """Recursively traverse directories, converting and uploading Markdown files."""
class ConfluencePublisher(Publisher):
    def __init__(self, confluence: Confluence | None = None):
        pass

    def pre_publish_hook(self):
        """
        Specialized for this subclass.
        Fetch all pages matching space, label and suffix
        """

    def post_publish_hook(self):
        """
        Specialized for this subclass.
        Delete pages not in the ContentTree
        """

    def publish_node(self, node: ContentNode, parent_id: str | None) -> str:
        """
        Create or update pages, including attachments, ensuring labels
        on newly created pages.
        """
        pass
```

3. **FileManager** (unchanged, conceptual)
3. **Parser**

Handles file reading and potentially logging or other file outputs, and maybe traversing the file system
Responsible for parsing the source files from e.g. the file system.

```python
class FileManager:
    def read_file(self, path: str) -> str:
        """Read the content of a file."""
class Parser(ABC):
    @abstractmethod
    def parse_directory(self, directory: str) -> ContentTree:
        pass


class MarkdownParser(Parser):
    def parse_directory(self, directory: str) -> ContentTree:
        pass
```

### Workflow Overview with Snippets
4. **ContentTree**

Defines the shared data structure for content between Parser and Publisher

- The process starts with `Publisher`, which is initialized with necessary configurations and an instance of `ConfluenceClient`.

```python
publisher = Publisher(confluence_client=ConfluenceClient(confluence_config), source_directory="path/to/markdown", space_key="SPACEKEY")
publisher.publish()
```
@dataclass
class ContentNode:
    name: str
    content: str | None = None
    metadata: dict | None = None
    parent: 'ContentNode | None' = None
    children: dict[str, 'ContentNode'] = field(default_factory=dict)

    def add_child(self, node: 'ContentNode'):
        pass

    def get_child(self, name: str) -> 'ContentNode | None':
        pass

- `Publisher.publish()` begins the process, invoking `traverse_directory()` to walk through the directory structure, processing each Markdown file by converting it to HTML.
    def is_leaf(self) -> bool:
        pass

- For each processed file, `Publisher` uses `ConfluenceClient.create_or_update_page()` to either create a new page or update an existing one in Confluence, applying a predefined label to mark the page as managed by `markdown2confluence`.
    def is_root(self) -> bool:
        pass

- Should a page need to be deleted or labels added, `Publisher` utilizes other methods of `ConfluenceClient` like `delete_page()` and maybe `add_labels_to_page()`, ensuring the Confluence space remains synchronized with the source content.
    def __str__(self, level: int = 0) -> str:
        pass

### Conclusion

This architecture, enriched with interface snippets, outlines a clear, modular approach to converting and managing Markdown content within Confluence, ensuring scalability and maintainability through well-defined responsibilities and robust Confluence API interactions.
@dataclass
class ContentTree:
    root: ContentNode = field(default_factory=lambda: ContentNode('root'))

    def add_node(self, path_list: list, content: str | None = None,
                 metadata: dict | None = None):
        pass

    def find_node(self, path_list: list) -> ContentNode | None:
        pass

    def __str__(self) -> str:
        pass
```
14 changes: 8 additions & 6 deletions Dockerfile
@@ -1,21 +1,23 @@
FROM python:3.10-slim
FROM python:3.11-slim

WORKDIR /app

COPY requirements.txt /app/

RUN pip install --no-cache-dir -r requirements.txt

COPY . /app/

# Install the current package
RUN pip install .

ENV CONFLUENCE_USERNAME=""
ENV CONFLUENCE_PASSWORD=""
ENV CONFLUENCE_URL="https://yourdomain.atlassian.net/wiki/rest/api/"
ENV CONFLUENCE_URL="https://yourdomain.atlassian.net/wiki/"
ENV CONFLUENCE_SPACE_ID="yourspace"
ENV CONFLUENCE_PARENT_PAGE_ID="12345"
ENV CONFLUENCE_PAGE_TITLE_SUFFIX="(autogenerated)"
ENV CONFLUENCE_PAGE_LABEL="markdown2confluence"
ENV MARKDOWN_FOLDER="./"
ENV MARKDOWN_SOURCE_REF=""

COPY ./markdown2confluence /app

CMD ["python", "/app/main.py"]
CMD ["python", "markdown2confluence/main.py"]
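
One way the application might read these variables at startup is sketched below. `Settings` and `load_settings` are hypothetical names, not taken from the project's `config.py`. The fallback values mirror the defaults in the Dockerfile above, and treating an empty `CONFLUENCE_URL` or `CONFLUENCE_SPACE_ID` as an error is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Hypothetical container for the values the Dockerfile defines.
    url: str
    space_id: str
    title_suffix: str
    label: str
    markdown_folder: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables (illustrative only)."""
    required = ("CONFLUENCE_URL", "CONFLUENCE_SPACE_ID")
    # Assumption: an empty string counts as "not configured".
    missing = [key for key in required if not env.get(key)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return Settings(
        url=env["CONFLUENCE_URL"],
        space_id=env["CONFLUENCE_SPACE_ID"],
        title_suffix=env.get("CONFLUENCE_PAGE_TITLE_SUFFIX",
                             "(autogenerated)"),
        label=env.get("CONFLUENCE_PAGE_LABEL", "markdown2confluence"),
        markdown_folder=env.get("MARKDOWN_FOLDER", "./"),
    )
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.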
16 changes: 16 additions & 0 deletions Pipfile
@@ -0,0 +1,16 @@
[[source]]
url = "https://pypi.org/simple"
verify_ssl = true
name = "pypi"

[packages]
atlassian-python-api = "*"
markdown = "*"

[dev-packages]
setuptools = "*"
pytest-watch = "*"
pytest = "*"

[requires]
python_version = "3.11"