[DISCUSS] Prototype a Base64Encoder#2452
Conversation
|
The only issue I see is that Squid base64 encoding is only used on buffers. So we would end up adding a stream just for the purpose of using the encoder. |
rousskov
left a comment
There was a problem hiding this comment.
I had already flagged this API direction as problematic before this PR was posted. Will find time to detail that claim.
rousskov
left a comment
There was a problem hiding this comment.
IMO, the serious problems identified in this incomplete review are enough to warrant switching to a different approach to base64-encoding. There are other, mostly secondary problems in this PR that this review does not flag. Most will probably disappear on the adjusted path.
I can suggest starting with an API like this:
namespace AnyP {
/// A base64-encoded version of the given input.
SBuf Base64Encode(const SBuf &);
} // namespace AnyP| * This class inherits from std::ostream to provide a familiar streaming | ||
| * interface, similar to SBufStream. | ||
| */ | ||
| class Base64Encoder : public std::ostream |
There was a problem hiding this comment.
The proposed inheritance is an API mistake for several reasons, including:
-
std::ostreamis meant for writing bytes to various destinations (as defined by derived classes). A base64 encoder should not know or care about the destination of the encoded bytes. I should be able to use the same encoder for writing into a file, a socket, or an SBuf. Changing high-level information into written bytes is the domain of formatting I/O functions (e.g., STL-provided<<operators) and I/O manipulators, notstd::ostreamkids. -
Base64 encoding requires a special action to finalize the result.
std::ostreamdoes not have a caller-convenient, safe, and efficient way for triggering that action. Neither class destructor norbuf()are it, for various reasons. -
Formatted and locale-specific output (e.g., various STL-provided
<<operators and manipulators) is not fully compatible with many (all?) known base64 use cases in Squid code, where callers must base64-encode values while following strict syntax rules. An extra space character or the wrong number representation is likely to violate the protocol requirements but is not going to be caught by the compiler. In this context, we probably want to highlight value conversions rather than hide them behind various convenience APIs.
| Base64Encoder::Base64StreamBuf::sync() | ||
| { | ||
| encoder_.encodePending(); | ||
| encoder_.finalize(); |
There was a problem hiding this comment.
This sync() implementation essentially violates STL streambuf::sync() API. STL code that we do not control should be able to call this method multiple times, as needed, without being worried that the very first call effectively places (or should place) the stream into a "no more updates" state, a state that Base64Encoder enters when its finalize() method is called.
|
|
||
| /// Create a Base64Encoder with optional maximum encoded output size limit | ||
| /// \param maxEncodedSize maximum encoded output size (default: noLimit) | ||
| explicit Base64Encoder(size_t maxEncodedSize = noLimit); |
There was a problem hiding this comment.
Burdening this SBuf-based class with explicitly maintaining output size restrictions is probably wrong. The underlying code should not care. Modern users should not care either. Legacy users, if any, would be better off with checking when converting from SBuf into their own storage.
Stemming from the converstaions in PR #2447 which focuses on
addressing a problem as narrowly as possible , here
is what I would envision a base64 encoder to look like.
It's a prototype, vibe-coded with lots of hand-holding on my side,
so please don't focus too much on the code itself, it comes with
a comprehensive unit test which is meant to showcase the API.
Does anyone see anything obviously wrong with the approach or is
it worth investing into by reviewing, polishing, writing a decoder and using it?