README updated

2025-11-07 09:52:28 -06:00
parent b15174af8b
commit 96ce7867de
1 changed files with 107 additions and 76 deletions
--- a/README.md
+++ b/README.md
@@ -49,17 +49,11 @@ Common parameters and returns (applies to all items below):
 - into: optional output buffer (see below)
 - maclen: MAC tag length 16 or 32 bytes (default 16)

-Only the first few can be positional arguments that are always provided in this order. All arguments can be passed as kwargs.
+Only the first few can be positional arguments that are always provided in this order. All arguments can be passed as kwargs. The inputs can be any Buffer supporting len() (e.g. `bytes`, `bytearray`, `memoryview`).

 Most functions return a buffer of bytes. By default a `bytearray` of the correct size is returned. An existing buffer can be provided by `into` argument, in which case the bytes of it that were written to are returned as a memoryview.

-
- random_key() -> bytes (correct length for the module)
- random_nonce() -> bytes (correct length for the module)
-
-Constants (per module): KEYBYTES, NPUBBYTES, ABYTES_MIN, ABYTES_MAX, RATE, ALIGNMENT
-
-### One-shot AEAD:
+### One-shot AEAD

 Encrypt and decrypt messages with built-in authentication:
 - encrypt(key, nonce, message, ad=None, maclen=16, into=None) -> ct_with_mac
@@ -73,7 +67,7 @@ No MAC tag, vulnerable to alterations:
 - encrypt_unauthenticated(key, nonce, message, into=None) -> ciphertext  (testing only)
 - decrypt_unauthenticated(key, nonce, ct, into=None) -> plaintext        (testing only)

-### Incremental AEAD:
+### Incremental AEAD

 Stateful classes that can be used for processing the data in separate chunks:
 - Encryptor(key, nonce, ad=None)
@@ -83,7 +77,7 @@ Stateful classes that can be used for processing the data in separate chunks:
    - update(ct_chunk[, into]) -> plaintext_chunk
    - final(mac) -> None (raises ValueError on failure)

-### Message Authentication Code:
+### Message Authentication Code

 No encryption, but prevents changes to the data without the correct key.

@@ -93,96 +87,146 @@ No encryption, but prevents changes to the data without the correct key.
    - final(maclen=16[, into]) -> mac
    - verify(mac) -> bool (True on success; raises ValueError on failure)

-### Keystream generation:
+### Keystream generation

 Useful for creating pseudo random bytes as rapidly as possible. Reuse of the same (key, nonce) creates identical output.

- stream(key, nonce=None, length=None, into=None) -> pseudorandom bytes (for tests/PRNG-like use)
+- stream(key, nonce=None, length=None, into=None) -> randombytes
+
+### Miscellaneous
+
+Constants (per module): KEYBYTES, NPUBBYTES, ABYTES_MIN, ABYTES_MAX, RATE, ALIGNMENT
+
+- random_key() -> bytearray (length KEYBYTES)
+- random_nonce() -> bytearray (length NPUBBYTES)
+- nonce_increment(nonce)
+- wipe(buffer)
+
+### Exceptions
+
+- Authentication failures raise ValueError.
+- Invalid sizes/types raise TypeError.
+- Unexpected errors from libaegis raise RuntimeError.


 ## Examples

-Detached tag (ct, mac)
+### Authentication only

+A cryptographically secure keyed hash is produced. The example uses all zeroes for the nonce to always produce the same hash for the same key:
 ```python
-from pyaegis import aegis256x4 as a
-key, nonce = a.random_key(), a.random_nonce()
-ct, mac = a.encrypt_detached(key, nonce, b"secret", ad=b"hdr", maclen=32)
-pt = a.decrypt_detached(key, nonce, ct, mac, ad=b"hdr")
+from pyaegis import aegis256x4 as ciph
+key, nonce = ciph.random_key(), bytes(ciph.NPUBBYTES)
+
+mac = ciph.mac(key, nonce, b"message", maclen=32)
+print(mac)
+
+st = ciph.Mac(key, nonce)
+st.update(b"message")
+st.update(b"Mallory Says Hello!")
+st.verify(mac)  # Raises ValueError
 ```

-Incremental:
+### Detached mode encryption and decryption
+
+This example keeps the ciphertext, MAC and associated data (ad) separate. The ad represents a file header that must be protected against tampering.

 ```python
-from pyaegis import aegis256x4 as a
-key, nonce = a.random_key(), a.random_nonce()
+from pyaegis import aegis256x4 as ciph
+key, nonce = ciph.random_key(), ciph.random_nonce()

-enc = a.Encryptor(key, nonce, ad=b"hdr")
+ct, mac = ciph.encrypt_detached(key, nonce, b"secret", ad=b"header")
+pt = ciph.decrypt_detached(key, nonce, ct, mac, ad=b"header")
+print(ct, mac, pt)
+
+ciph.wipe(key)  # Zero out sensitive buffers after use (recommended)
+ciph.wipe(pt)
+```
+
+### Incremental updates
+
+The class-based interface for incremental updates is an alternative to the one-shot functions. It should not be confused with separately verified ciphertext frames (see the next example).
+
+```python
+from pyaegis import aegis256x4 as ciph
+key, nonce = ciph.random_key(), ciph.random_nonce()
+
+enc = ciph.Encryptor(key, nonce, ad=b"header")
 c1 = enc.update(b"chunk1")
 c2 = enc.update(b"chunk2")
-mac = enc.final(maclen=16)   # returns only the tag
+mac = enc.final(maclen=16)

-dec = a.Decryptor(key, nonce, ad=b"hdr")
+dec = ciph.Decryptor(key, nonce, ad=b"header")
 p1 = dec.update(c1)
 p2 = dec.update(c2)
 dec.final(mac)               # raises ValueError on failure
 ```

-MAC-only:
+### Large data AEAD encryption/decryption
+
+It is often practical to split larger messages into frames that can be individually decrypted and verified. Because every frame needs a different nonce, we use the `nonce_increment` utility function to produce sequential nonces. As far as the AEGIS algorithm is concerned, each frame is a completely independent invocation. Because the initial nonce is random, each run of the program produces a completely different, random-looking encrypted.bin file.

 ```python
-from pyaegis import aegis256x4 as a
-key, nonce = a.random_key(), a.random_nonce()
+from pyaegis import aegis128x4 as ciph

-mac = a.mac(key, nonce, b"data", maclen=32)
+message = bytearray(30 * b"Attack at dawn! ")
+key = b"sixteenbyte key!"  # 16 bytes secret key for aegis128* algorithms
+nonce = ciph.random_nonce()
+framebytes = 80  # In real applications 1 MiB or more is practical
+maclen = ciph.ABYTES_MIN  # 16

-st = a.Mac(key, nonce)
-st.update(b"data")
-st.update(b"more data")
-st.verify(mac)  # True or raises ValueError
+with open("encrypted.bin", "wb") as f:
+    f.write(nonce)  # Public initial nonce sent with the ciphertext
+    while message:
+        chunk = message[:framebytes - maclen]
+        del message[:len(chunk)]
+        ct = ciph.encrypt(key, nonce, chunk, maclen=maclen)
+        ciph.nonce_increment(nonce)
+        f.write(ct)
 ```

-Pre-allocated buffers (avoid allocations):
-
 ```python
-from pyaegis import aegis256x4 as a
-key, nonce = a.random_key(), a.random_nonce()
-msg = b"data"
+from pyaegis import aegis128x4 as ciph

-out = bytearray(len(msg) + 16)
-view = a.encrypt(key, nonce, msg, into=out)
-assert bytes(view) == bytes(out)
+# Decryption needs same values as encryption
+key = b"sixteenbyte key!"
+framebytes = 80
+maclen = ciph.ABYTES_MIN
+
+with open("encrypted.bin", "rb") as f:
+    nonce = bytearray(f.read(ciph.NPUBBYTES))
+    while True:
+        frame = f.read(framebytes)
+        if not frame:
+            break
+        pt = ciph.decrypt(key, nonce, frame, maclen=maclen)
+        ciph.nonce_increment(nonce)
+        print(pt)
 ```

-In-place (same buffer for input and into):
+### Preallocated output buffers (into=)
+
+For advanced use cases, the output buffer can be supplied with the `into` kwarg. Any writable buffer with len() >= the space required can be used, including bytearrays, memoryviews, mmap files and NumPy arrays.
+
+A `TypeError` is raised if the buffer is too small. For convenience, the functions return a memoryview showing only the bytes actually written.
+
+In-place operations are supported when the input and the output point to the same location in memory. When using an attached MAC tag, the input buffer needs to be sliced to the correct length:

 ```python
-from pyaegis import aegis256x4 as a
-key, nonce = a.random_key(), a.random_nonce()
+from pyaegis import aegis256x4 as ciph
+key, nonce = ciph.random_key(), ciph.random_nonce()
+buf = memoryview(bytearray(1000))  # memoryview[:len] is still in the same buffer (no copy)
+buf[:7] = b"message"

-# Attached tag: place plaintext at the start of a buffer that has room for the tag
-msg = b"secret"
-maclen = 16
-buf = bytearray(len(msg) + maclen)
-buf[: len(msg)] = msg
-m = memoryview(buf)[: len(msg)]
+# Each function returns a memoryview capped to correct length
+ct = ciph.encrypt(key, nonce, buf[:7], into=buf)
+pt = ciph.decrypt(key, nonce, ct, into=buf)

-# Encrypt in-place (ciphertext written back into buf, tag appended)
-a.encrypt(key, nonce, m, into=buf)
-
-# Decrypt back in-place (plaintext written over the start region)
-a.decrypt(key, nonce, buf, into=m)  # uses default maclen=16
-assert bytes(m) == msg
-
-# Detached mode: ciphertext written back to the same buffer
-buf2 = bytearray(len(msg))
-buf2[:] = msg
-m2 = memoryview(buf2)
-ct_view, mac = a.encrypt_detached(key, nonce, m2, ct_into=buf2)
-a.decrypt_detached(key, nonce, ct_view, mac, into=m2)
-assert bytes(m2) == msg
+print(bytes(pt))
 ```

+Detached and unauthenticated modes can use input and output buffers of the same size (no MAC is added to the ciphertext). Detached encryption takes `ct_into` and `mac_into` instead of `into` and returns memoryviews of both.
+
 ## Performance

 Runtime CPU feature detection selects optimized code paths (AES-NI, ARM Crypto, AVX2/AVX-512). Multi-lane variants (x2/x4) offer higher throughput on suitable CPUs.
@@ -193,7 +237,7 @@ Run the built-in benchmark to see which variant is fastest on your machine:
 python -m pyaegis.benchmark
 ```

-Benchmarks of the Python module and the C library run on Intel i7-14700, linux, single core (the software is not multithreaded). Note that the results are in megabits per second, not bytes. The CPU lacks AVX-512 that makes the X4 variants faster on AMD hardware.
+Benchmarks of the Python module and the C library run on an Intel i7-14700, Linux, single core (the software is not multithreaded). Note that the results are in megabits per second, not bytes. The CPU lacks AVX-512, which makes the X4 variants faster on AMD hardware. The Python library performance is similar to that of the C library.

 <table>
 <tr>
@@ -233,16 +277,3 @@ AEGIS-256X4 MAC  392088.05 Mb/s
 </td>
 </tr>
 </table>
-
-## Errors
-
- Authentication failures raise ValueError.
- Invalid sizes/types raise TypeError.
- Unexpected errors from libaegis raise RuntimeError.
-
-## Security notes
-
- Never reuse a nonce with the same key. Prefer a.random_nonce() per message.
- Keep keys secret; use a.random_key() to get correctly sized keys.
- AAD (ad=...) is authenticated but not encrypted.
- Do not use stream() or unauthenticated helpers for real data protection; they are for testing and specialized cases.
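
The frame arithmetic used by the "Large data AEAD encryption/decryption" example in this change can be checked without pyaegis. The following is a minimal sketch, assuming the values from that example (a 480-byte message, `framebytes = 80`, `maclen = 16`). `frame_layout` is a hypothetical helper written for illustration and is not part of pyaegis.

```python
def frame_layout(message_len, framebytes=80, maclen=16):
    """Return the on-disk size of each frame (ciphertext plus MAC tag).

    Hypothetical helper: mirrors the slicing in the README example, where
    each frame carries at most framebytes - maclen bytes of plaintext.
    """
    payload = framebytes - maclen  # plaintext bytes carried by a full frame
    if payload <= 0:
        raise ValueError("framebytes must be larger than maclen")
    sizes = []
    remaining = message_len
    while remaining > 0:
        chunk = min(payload, remaining)
        sizes.append(chunk + maclen)  # attached tag is appended to every frame
        remaining -= chunk
    return sizes


# 30 repetitions of the 16-byte string "Attack at dawn! " give 480 bytes
frames = frame_layout(30 * 16)
print(len(frames), frames[-1], sum(frames))
```

With these values the message splits into seven full 80-byte frames and a final 48-byte frame. This is why the decryption loop can read `framebytes` at a time and still handle a short last read. The file also begins with the public nonce, which is not counted here.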