smp-server: support namespaces#1784
Conversation
a6b960b to
236e69d
Compare
37764a4 to
c3e7b61
Compare
… TLD case, shutdown)
…ngEndpoint timeout)
- Reject IPv6 aliases of 169.254.169.254 (IPv4-compatible / IPv4-mapped / 6to4 / NAT64) via numeric range check on parsed IPv6. - Disable HTTP redirects on the Eth RPC request. - Restrict SimplexName labels to ASCII (Cyrillic/Greek/full-width otherwise hash to different on-chain records and diverge from UTS-46 registrars). - pingEndpoint: only JsonRpcErr means "reachable"; transport/decode failures fail startup. boundedIniInt: readMaybe over partial read. - Add 127.0.0.0/8 and 0.0.0.0 to isLoopback. - Replace hand-rolled hex helpers with Data.ByteArray.Encoding; raise managerConnCount to match rpcMaxConcurrency; hex Show for NameOwner. - Fuse parallel http/https when into unless+case; drop reverse/re-reverse in mkDomain TLDWeb; first AbiInvariantViolated; Nothing <$ decodeAddress; forM_ (eitherToMaybe ...); >>= chain in NameOwner FromJSON. - Drop dead imports/exports/pragmas and two restating comments. - Tests: factor unsafeOwner/unsafeLink, addr1/2/3, testNamesConfig; add non-ASCII label rejection coverage.
The bare-name fallback and bareDomain parser would otherwise consume arbitrarily many non-space bytes via takeWhile1 before any validation or length check. A crafted multi-megabyte token would be decoded as UTF-8 and re-parsed in full before being rejected. Introduce `boundedNonSpace` (scan with 253-byte cap) at the two takeWhile1 sites. Inputs longer than 253 bytes leave residue that parseOnly's implicit endOfInput rejects, so the parser fails fast without ever allocating the full input. The bound is the DNS full-domain limit, chosen for being a familiar ceiling generous enough to cover any realistic SimpleX name (longest plausible @user.subdomain.simplex stays well under 100 bytes). No per-label cap — SimpleX names don't go through DNS label resolution and there's no semantic reason to constrain individual labels.
e90b13e to
9e4d8d9
Compare
…out auth) validateUrl gains two operator-friendly relaxations and a regression test: - Allow a path prefix (e.g. https://gw.example.com:443/snrc) for a resolver behind a reverse-proxy sub-path; /resolve/<name> and /health are appended (HttpResolver already strips one trailing slash, so root and sub-path behave identically). Query/fragment/userinfo stay rejected. - Off-loopback, reject only http WITH resolver_auth (the Authorization header would travel in cleartext). http without auth is now allowed (no secret to leak; resolver data is public — also lets dev setups reach a host resolver via http://host.docker.internal). https is always allowed, with or without auth. Plain http has no response integrity; intended for trusted/local networks only. Exports validateUrl and adds validateUrlSpec (11 cases) to SMPNamesTests.
RSLV collapsed every non-hit (no resolver, malformed name, not found, backing-store failure) to ERR AUTH, so a client iterating its configured servers could not tell "this router has no resolver, try the next" from "name not registered, stop", and a transient backend error read as an authoritative miss. Names capability is runtime config, orthogonal to the linear SMP version (a future v21 router without [NAMES] must still advertise v21), so it is signalled by a command-time error like allowSMPProxy, not by the version range: no resolver configured -> ERR CMD PROHIBITED (client skips, tries next) backing-store failure -> ERR INTERNAL (transient: retry/surface) not found / malformed -> ERR AUTH (authoritative "no such name") Update the protocol spec error table and add agent tests for the no-resolver (CMD PROHIBITED) and backend-failure (INTERNAL) paths.
Addresses epoberezkin's review (PR #1784). Name resolution becomes a server role like proxy; the agent owns resolution + server selection; one error type flows through the whole stack. - ServerRoles gains `names`; UserServers gains `nameSrvs` (opt-in list); resolveSimplexName drops the explicit server arg and picks a names-capable server via getNextServer. - RSLV carries SimplexNameDomain (was RslvRequest): no JSON on the wire, contract dropped, name validated at parse (invalid -> CMD SYNTAX). - Version check moves from the encoder to Client.hs (no ERR to server). - ErrorType.NAME {nameErr :: NameErrorType} (+ AgentErrorType.NAME), wire- and JSON-encoded; resolver errors surface with diagnostics. Success response renamed NAME -> RNAME to free the collision. - NameOwner -> EthAddress (record selector); NameRecord derives FromJSON and gains field-ordered Encoding; per-field caps removed. - Remove newEnvWithNames / runSMPServerBlockingWithNames test seams; stub resolver folded into ServerConfig.namesResolverCall_.
NameResolverStatsData adds 6 lines to the server stats backup (the "rslvStats:" header plus the reqs/succ/notFound/resolverErrs/disabled fields), so testRestoreMessages' expected stats-backup line count is 95 -> 101.
| -- Wire encoding for the SMP NAME response: field-ordered smpEncode, not embedded | ||
| -- JSON. Field order = record declaration order. EthAddress encodes as its raw | ||
| -- 20 bytes (length-prefixed via the ByteString instance). | ||
| instance Encoding NameRecord where |
There was a problem hiding this comment.
wire encoding should use JSON
| namesEnv <- case namesConfig of | ||
| Nothing -> pure Nothing | ||
| Just nc -> case namesResolverCall_ of |
There was a problem hiding this comment.
| namesEnv <- case namesConfig of | |
| Nothing -> pure Nothing | |
| Just nc -> case namesResolverCall_ of | |
| namesEnv <- forM namesConfig $ \nc -> case namesResolverCall_ of |
There was a problem hiding this comment.
this is a strange way to inject stub. Shouldn't it be the same endpoint?
| Nothing -> pure Nothing | ||
| Just nc -> case namesResolverCall_ of | ||
| -- test seam: stub resolver, no real HTTP env or startup probe | ||
| Just call -> Just <$> newNamesEnvWith nc call Nothing |
There was a problem hiding this comment.
| Just call -> Just <$> newNamesEnvWith nc call Nothing | |
| Just call -> newNamesEnvWith nc call Nothing |
| pingEndpoint env >>= \case | ||
| Right _ -> logInfo "[NAMES] endpoint probe ok" | ||
| Left e -> logWarn $ "[NAMES] endpoint probe failed (server will still start, RSLV will return ERR (NAME ...) until reachable): " <> tshow e | ||
| pure (Just env) |
There was a problem hiding this comment.
| pure (Just env) | |
| pure env |
| Nothing -> do | ||
| logInfo $ "[NAMES] resolver enabled, endpoint=" <> scrubUrl (resolverEndpoint nc) | ||
| when allowSMPProxy $ | ||
| logWarn "[NAMES] enable: on on a proxy-role host: slow RSLV calls can serialise other forwarded commands on the same proxy-relay session. For high-volume deployments, run [NAMES] on a separate host." |
There was a problem hiding this comment.
why? this is a strange comment. We allow SMP proxy on all hosts, name resolution should be async in the same way all proxy calls are async.
| \# Operator runs the resolver alongside smp-server (default port 8000)\n\ | ||
| \# with its own Ethereum JSON-RPC endpoint configured in resolver.toml.\n\ | ||
| \# Co-locating with the proxy role logs a startup advisory: slow RSLV calls can\n\ | ||
| \# serialise other forwarded commands on the same proxy-relay session.\n\ | ||
| \# For high-volume deployments, run [NAMES] on a separate host.\n\ | ||
| \# Restart required to change settings.\n\ |
There was a problem hiding this comment.
this is wrong and should not be the case
There was a problem hiding this comment.
also, we don't colocate it with SMP, right?
| else case J.eitherDecodeStrict (BL.toStrict bs) of | ||
| Left e -> pure (Left (InvalidJson e)) | ||
| Right v -> pure (Right v) |
There was a problem hiding this comment.
| else case J.eitherDecodeStrict (BL.toStrict bs) of | |
| Left e -> pure (Left (InvalidJson e)) | |
| Right v -> pure (Right v) | |
| else first InvalidJson <$> J.eitherDecodeStrict (BL.toStrict bs) |
There was a problem hiding this comment.
in general, please review and refactor such trivial cases into forM, mapM, first, second, bimap, maybe, etc.
|
|
||
| doGet :: ResolverEnv -> Text -> IO (Either ResolverError J.Value) | ||
| doGet ResolverEnv {manager, baseUrl, authHdr, timeoutMicro, maxResponseBytes} path = do | ||
| req0 <- parseRequest (T.unpack (baseUrl <> path)) |
There was a problem hiding this comment.
this T.unpack is a smell that incorrect type is passed from upstream
| -- | Percent-encode a name component (path-safe). Aggressive: encode every | ||
| -- byte that isn't an unreserved character per RFC 3986. The resolver expects | ||
| -- raw labels (e.g., `alice.simplex`); slashes and other ASCII punctuation | ||
| -- would change the request path semantics if passed through verbatim. | ||
| percentEncode :: Text -> Text | ||
| percentEncode = decodeLatin1 . urlEncode True . encodeUtf8 |
There was a problem hiding this comment.
trivial function that is not needed if you use correct types (String or ByteString)
| -- and the smpEncoded NameRecord is <= its JSON body, so capping | ||
| -- the body here guarantees the response always frames. An | ||
| -- over-cap body fails as BodyTooLarge -> ERR (NAME (RESOLVER ..)). | ||
| resolverMaxResponseBytes = boundedIniInt 16000 1024 16000 "resolver_max_response_bytes" |
There was a problem hiding this comment.
and in some other place it's 64kb and no 16. If we need larger size, we could use zstd compression in SMP protocol.
| other -> Left $ "unexpected port syntax: " <> other | ||
| unless (null (uriQuery uri)) $ Left "query string not allowed (it does not compose with the appended /resolve/<name> path)" | ||
| unless (null (uriFragment uri)) $ Left "fragment not allowed (fragments are never sent to the server)" | ||
| -- A path prefix is allowed and used as the base for /resolve/<name> and |
There was a problem hiding this comment.
this is very complex and unreadable, needs to be simplified
There was a problem hiding this comment.
we use library for URI parsing, why we are re-implement it here?
| -- * link-local hosts (169.254.0.0/16, including the cloud metadata IP | ||
| -- 169.254.169.254) are rejected unconditionally | ||
| validateUrl :: Text -> Maybe RpcAuth -> Either String Text | ||
| validateUrl url auth_ = do |
There was a problem hiding this comment.
this function needs to be fully re-written or used from library. It has to be like 1-3 lines of code, not 50.
| -- | Parse an rpc_auth INI value. Scheme keyword is case-insensitive so | ||
| -- "Bearer <token>" / "BEARER <token>" (Caddy / RFC 7235 convention) work | ||
| -- as well as the lowercase form. | ||
| parseRpcAuth :: Text -> Either String RpcAuth |
There was a problem hiding this comment.
same here - it's an existing library function, it should not be in our code.
There was a problem hiding this comment.
the library is uri-bytestring, just needs to be added and used
| | -- | no name-resolving servers configured (agent-originated only) | ||
| NO_SERVERS |
There was a problem hiding this comment.
as it's agent-originated, it must not be in this type - it must be in Agent errors
There was a problem hiding this comment.
This type is only for errors returned by the server, it can't contain errors reported upstream
| = -- | the names role / resolver is not configured on this server | ||
| NO_RESOLVER | ||
| | -- | the name is not registered (resolver returned not-found) | ||
| NO_NAME |
There was a problem hiding this comment.
NOT_FOUND would be more typical
There may also be expired error - I don't know how it will be reported by resolver, so probably not needed for now.
| -- Name is validated at parse (invalid syntax fails here -> CMD error), | ||
| -- so the handler only ever sees a valid SimplexNameDomain. |
There was a problem hiding this comment.
| -- Name is validated at parse (invalid syntax fails here -> CMD error), | |
| -- so the handler only ever sees a valid SimplexNameDomain. |
redundant
| PRXY host auth_ -> e (PRXY_, ' ', host, auth_) | ||
| PFWD fwdV pubKey (EncTransmission s) -> e (PFWD_, ' ', fwdV, pubKey, Tail s) | ||
| RFWD (EncFwdTransmission s) -> e (RFWD_, ' ', Tail s) | ||
| -- Version gating is the client's job (Client.hs), not the encoder's. |
There was a problem hiding this comment.
| -- Version gating is the client's job (Client.hs), not the encoder's. |
| PFWD fwdV pubKey (EncTransmission s) -> e (PFWD_, ' ', fwdV, pubKey, Tail s) | ||
| RFWD (EncFwdTransmission s) -> e (RFWD_, ' ', Tail s) | ||
| -- Version gating is the client's job (Client.hs), not the encoder's. | ||
| RSLV d -> e (RSLV_, ' ', Tail (strEncode d)) |
There was a problem hiding this comment.
| RSLV d -> e (RSLV_, ' ', Tail (strEncode d)) | |
| RSLV d -> e (RSLV_, ' ', d) |
There was a problem hiding this comment.
maybe I am missing something, but using Tail explicitly here is redundant - all it achieves is preventing length prefixing of the string. Directly using d here achieves the same.
| -- so the handler only ever sees a valid SimplexNameDomain. | ||
| CT SResolver RSLV_ -> do | ||
| Tail bs <- _smpP | ||
| either fail (pure . Cmd SResolver . RSLV) (strDecode bs) |
There was a problem hiding this comment.
| either fail (pure . Cmd SResolver . RSLV) (strDecode bs) | |
| Cmd . SResolver . RSLV <$> _smpP |
| PONG -> e PONG_ | ||
| -- Field-ordered Encoding NameRecord (no JSON on the wire); a response that | ||
| -- arrived is already on a supported version, so no version gate. | ||
| RNAME rec -> e (RNAME_, ' ', rec) |
There was a problem hiding this comment.
| RNAME rec -> e (RNAME_, ' ', rec) | |
| RNAME rec -> e (RNAME_, ' ', Tail $ toStrict $ J.encode rec) |
| OK_ -> pure OK | ||
| ERR_ -> ERR <$> _smpP | ||
| PONG_ -> pure PONG | ||
| RNAME_ -> RNAME <$> _smpP |
There was a problem hiding this comment.
| RNAME_ -> RNAME <$> _smpP | |
| RNAME_ -> fmap RNAME . J.decodeStrict . unTail <$?> _smpP |
| smpEncode = \case | ||
| NO_RESOLVER -> "NO_RESOLVER" | ||
| NO_NAME -> "NO_NAME" | ||
| NO_SERVERS -> "NO_SERVERS" |
| -- error that reaches the client as ERR (NAME ...). | ||
| (selector, msg) <- asks namesEnv >>= \case | ||
| Nothing -> pure (rslvDisabled, ERR $ NAME NO_RESOLVER) | ||
| Just nenv -> liftIO (resolveName nenv d) <&> \case |
There was a problem hiding this comment.
that's the crux of the problem. You call resolver inline, blocking further command processing (and that's what all those comments were about). This has to use the same pattern as we use to create proxy session - it must be processed asynchronously, and response send from forked thread. The function simply returns Nothing as command response, deferring response to forked thread.
No description provided.