Edit on GitHub

webup

WebUp is a Python package for uploading websites to Amazon Web Services S3 buckets.

  • Uploads files and subdirectories
  • Multi-threaded for concurrent parallel uploads
  • Sets Cache-Control and Content-Type headers

Usage

To upload a directory with the default configuration, just call upload:

from webup import upload

upload("./public", bucket="my-bucket")

If the bucket's name is recorded in Systems Manager, pass a parameter name instead of a bucket name:

from webup import upload

upload("./public", ssm_param="/my-platform/buckets/website")

Some content types are baked-in. To add more content types:

from webup import set_content_type, upload

set_content_type(".foo", "application/foo")
upload("./public", bucket="my-bucket")

All files have the Cache-Control value "max-age=60" by default. To configure this:

from webup import set_default_maximum_age, set_maximum_age, upload

# Serve sw.js with Cache-Control "max-age=0":
set_maximum_age(r"^sw\.js$", 0)

# Serve all other JavaScript with Cache-Control "max-age=600":
set_maximum_age(r"^.*\.js$", 600)

# Serve all other files with Cache-Control "max-age=300":
set_default_maximum_age(300)

upload("./public", bucket="my-bucket")

To perform a dry-run:

from webup import upload

upload("./public", bucket="my-bucket", read_only=True)

Configuration

Cache-Control headers

By default, every object will be assigned the Cache-Control header "max-age=60".

To set a maximum age per file type, call set_maximum_age.

To set the default content type, call set_default_maximum_age.

Content-Type headers

Filename Content-Type
.css text/css
.eot application/vnd.m-fontobject
.html text/html
.js text/javascript
.png image/png
.ttf font/ttf
.woff font/woff
.woff2 font/woff2
* application/octet-stream

To add additional content types, call set_content_type.

To set the default content type, call set_default_content_type.

View Source
"""
**WebUp** is a Python package for uploading websites to Amazon Web Services S3
buckets.

- Uploads files and subdirectories
- Multi-threaded for concurrent parallel uploads
- Sets Cache-Control and Content-Type headers

## Usage

To upload a directory with the default configuration, just call `upload`:

```python
from webup import upload

upload("./public", bucket="my-bucket")
```

If the bucket's name is recorded in Systems Manager, pass a parameter name
instead of a bucket name:

```python
from webup import upload

upload("./public", ssm_param="/my-platform/buckets/website")
```

Some content types are baked-in. To add more content types:

```python
from webup import set_content_type, upload

set_content_type(".foo", "application/foo")
upload("./public", bucket="my-bucket")
```

All files have the Cache-Control value "max-age=60" by default. To configure this:

```python
from webup import set_default_maximum_age, set_maximum_age, upload

# Serve sw.js with Cache-Control "max-age=0":
set_maximum_age(r"^sw\\.js$", 0)

# Serve all other JavaScript with Cache-Control "max-age=600":
set_maximum_age(r"^.*\\.js$", 600)

# Serve all other files with Cache-Control "max-age=300":
set_default_maximum_age(300)

upload("./public", bucket="my-bucket")
```

To perform a dry-run:

```python
from webup import upload

upload("./public", bucket="my-bucket", read_only=True)
```

## Configuration

### Cache-Control headers

By default, every object will be assigned the Cache-Control header "max-age=60".

To set a maximum age per file type, call `set_maximum_age`.

To set the default content type, call `set_default_maximum_age`.

### Content-Type headers

| Filename | Content-Type                 |
|----------|------------------------------|
| .css     | text/css                     |
| .eot     | application/vnd.m-fontobject |
| .html    | text/html                    |
| .js      | text/javascript              |
| .png     | image/png                    |
| .ttf     | font/ttf                     |
| .woff    | font/woff                    |
| .woff2   | font/woff2                   |
| *        | application/octet-stream     |

To add additional content types, call `set_content_type`.

To set the default content type, call `set_default_content_type`.
"""

import importlib.resources

from webup.cache_control import set_default_maximum_age, set_maximum_age
from webup.content_type import set_content_type, set_default_content_type
from webup.queue import upload

with importlib.resources.open_text(__package__, "VERSION") as t:
    __version__ = t.readline().strip()

__all__ = [
    # This is intentionally the first so it's at the top of the API
    # documentation.
    "upload",
    "set_default_content_type",
    "set_default_maximum_age",
    "set_content_type",
    "set_maximum_age",
]
#   def upload( dir: str | pathlib.Path, bucket: str | None = None, concurrent_uploads: int = 8, out: Optional[IO[str]] = None, read_only: bool = False, region: str | None = None, ssm_param: str | None = None ) -> None:
View Source
def upload(
    dir: str | Path,
    bucket: str | None = None,
    concurrent_uploads: int = 8,
    out: IO[str] | None = None,
    read_only: bool = False,
    region: str | None = None,
    ssm_param: str | None = None,
) -> None:
    """
    Uploads the local directory `dir` to an S3 bucket.

    The bucket name must be set directly via `bucket` or read from the Systems
    Manager parameter named `ssm_param`.

    If `region` is not set then the default region will be used.

    `concurrent_uploads` describes the maximum number of concurrent upload
    threads to allow.

    `out` describes an optional string writer for progress reports. The same
    information will be available via logging, but this could be used for human-
    readable output (especially of read-only runs) at run-time. To write
    progress to the console, set to `sys.stdout`.

    If `read_only` is truthy then directories will be walked and files will be
    read, but nothing will be uploaded.
    """

    bucket = bucket_name(bucket, ssm_param, region)

    if isinstance(dir, str):
        dir = Path(dir)

    dir = dir.resolve().absolute()

    _logger.debug(
        "Starting %s concurrent %s of %s to %s in %s.",
        concurrent_uploads,
        "read-only uploads" if read_only else "uploads",
        dir,
        bucket,
        region,
    )

    files = Files(dir)
    output = Output(max_path=files.max_path, out=out) if out else None
    process_count = 0
    queue: "Queue[UploadResult]" = Queue(concurrent_uploads)
    wip: List[str] = []

    while True:
        full = len(wip) >= concurrent_uploads

        if wip:
            # If we *can* take on more work then don't wait; hurry up and add
            # more threads. Wait only when there's nothing more we can do.
            timeout = 1 if full else None
            check(output=output, queue=queue, timeout=timeout, wip=wip)

        if full:
            continue

        if file := files.next:

            wip.append(file.key)

            upload = Upload(
                bucket=bucket,
                cache_control=cache_control(file.path.relative_to(dir)),
                content_type=content_type(file.path.suffix),
                key=file.key,
                path=file.path.as_posix(),
                queue=queue,
                read_only=read_only,
                region=region,
            )

            upload.start()
            process_count += 1
            continue

        if not wip:
            _logger.debug("No files remaining. Upload complete.")
            return

Uploads the local directory dir to an S3 bucket.

The bucket name must be set directly via bucket or read from the Systems Manager parameter named ssm_param.

If region is not set then the default region will be used.

concurrent_uploads describes the maximum number of concurrent upload threads to allow.

out describes an optional string writer for progress reports. The same information will be available via logging, but this could be used for human- readable output (especially of read-only runs) at run-time. To write progress to the console, set to sys.stdout.

If read_only is truthy then directories will be walked and files will be read, but nothing will be uploaded.

#   def set_default_content_type(type: str = 'application/octet-stream') -> None:
View Source
def set_default_content_type(type: str = "application/octet-stream") -> None:
    """
    Sets the default Content-Type header for file types not registered via
    `set_content_type`.

    Defaults to "application/octet-stream".
    """

    global _default_content_type
    _default_content_type = type

Sets the default Content-Type header for file types not registered via set_content_type.

Defaults to "application/octet-stream".

#   def set_default_maximum_age(seconds: int = 60) -> None:
View Source
def set_default_maximum_age(seconds: int = 60) -> None:
    """
    Sets the Cache-Control "max-age" value for files not registered via
    `set_maximum_age`.

    Defaults to 60 seconds.
    """

    global _default_max_age
    _default_max_age = seconds

Sets the Cache-Control "max-age" value for files not registered via set_maximum_age.

Defaults to 60 seconds.

#   def set_content_type(suffix: str, type: str) -> None:
View Source
def set_content_type(suffix: str, type: str) -> None:
    """
    Registers the Content-Type header for files with the `suffix` filename
    extension.
    """

    suffix = normalize_suffix(suffix)
    _content_types[suffix] = type

Registers the Content-Type header for files with the suffix filename extension.

#   def set_maximum_age(pattern: str, seconds: int) -> None:
View Source
def set_maximum_age(pattern: str, seconds: int) -> None:
    """
    Sets the "max-age" value of the Cache-Control header for files that match
    the given pattern.
    """

    _max_ages[pattern] = seconds

Sets the "max-age" value of the Cache-Control header for files that match the given pattern.