This ensures that errors which happen during logging to syslog are logged
with proper context, such as "while logging to syslog" and the server name.
Prodded by Safar Safarly.
During initial startup the ngx_cycle->hostname is not available, and
previously this resulted in incorrect logging. Instead, hostname from the
configuration being parsed is now preserved in the syslog peer structure
and then used during logging.
Similarly, ngx_cycle->log might not match the configuration where the
syslog peer is defined if the configuration is not yet fully applied,
and previously this resulted in unexpected logging of syslog errors
and debug information. Instead, cf->cycle->new_log is now referenced
in the syslog peer structure and used for logging, similarly to how it
is done in other modules.
An UTF-8 octet sequence cannot start with a 11111xxx byte (above 0xf8),
see https://datatracker.ietf.org/doc/html/rfc3629#section-3. Previously,
such bytes were accepted by ngx_utf8_decode() and misinterpreted as 11110xxx
bytes (as in a 4-byte sequence). While unlikely, this can potentially cause
issues.
Fix is to explicitly reject such bytes in ngx_utf8_decode().
This is expected to help with clients using pipelining with some constant
depth, such as apt[1][2].
When downloading many resources, apt uses pipelining with some constant
depth, a number of requests in flight. This essentially means that after
receiving a response it sends an additional request to the server, and
this can result in requests arriving to the server at any time. Further,
additional requests are sent one-by-one, and can be easily seen as such
(neither as pipelined, nor followed by pipelined requests).
The only safe approach to close such connections (for example, when
keepalive_requests is reached) is with lingering. To do so, now nginx
monitors if pipelining was used on the connection, and if it was, closes
the connection with lingering.
[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=973861#10
[2] https://mailman.nginx.org/pipermail/nginx-devel/2023-January/ZA2SP5SJU55LHEBCJMFDB2AZVELRLTHI.html
Casts are believed to be not needed, since memcmp() has "const void *"
arguments since introduction of the "void" type in C89. And on pre-C89
platforms nginx is unlikely to compile without warnings anyway, as there
are no casts in memcpy() and memmove() calls.
These casts were added in 1648:89a47f19b9ec without any details on why they
were added, and Igor does not remember details either. The most plausible
explanation is that they were copied from ngx_strcmp() and were not really
needed even at that time.
Prodded by Alejandro Colomar.
The check is not expected to fail unless there is a bug in the calling
code. But given the check is here, it should log an alert if it fails
instead of silently closing the connection.
Maximum size for reading the PROXY protocol header is increased to 4096 to
accommodate a bigger number of TLVs, which are supported since cca4c8a715de.
Maximum size for writing the PROXY protocol header is not changed since only
version 1 is currently supported.
The variables have prefix $proxy_protocol_tlv_ and are accessible by name
and by type. Examples are: $proxy_protocol_tlv_0x01, $proxy_protocol_tlv_alpn.
Previously, all received user input was logged. If a multi-line text was
received from client and logged, it could reduce log readability and also make
it harder to parse nginx log by scripts. The change brings to PROXY protocol
the same behavior that exists for HTTP request line in
ngx_http_log_error_handler().
Similar to 70e65bf8dfd7, the change is made to ensure that the ability to
cancel resolver tasks is fully controlled by the caller. As mentioned in the
referenced commit, it is safe to make this timer cancelable because resolve
tasks can have their own timeouts that are not cancelable.
The scenario where this may become a problem is a periodic background resolve
task (not tied to a specific request or a client connection), which receives a
response with short TTL, large enough to warrant fallback to a TCP query.
With each event loop wakeup, we either have a previously set write timer
instance or schedule a new one. The non-cancelable write timer can delay or
block graceful shutdown of a worker even if the ngx_resolver_ctx_t->cancelable
flag is set by the API user, and there are no other tasks or connections.
We use the resolver API in this way to maintain the list of upstream server
addresses specified with the 'resolve' parameter, and there could be third-party
modules implementing similar logic.
FastCGI responder is expected to receive CGI/1.1 environment variables
in the parameters (see section "6.2 Responder" of the FastCGI specification).
Obviously enough, there cannot be multiple environment variables with
the same name.
Further, CGI specification (RFC 3875, section "4.1.18. Protocol-Specific
Meta-Variables") explicitly requires to combine headers: "If multiple
header fields with the same field-name are received then the server MUST
rewrite them as a single value having the same semantics".
Previously, QUIC used the existing UDP framework, which was created for UDP in
Stream. However the way QUIC connections are created and looked up is different
from the way UDP connections in Stream are created and looked up. Now these
two implementations are decoupled.
Response headers can be buffered in the SSL buffer. But stream's fake
connection buffered flag did not reflect this, so any attempts to flush
the buffer without sending additional data were stopped by the write filter.
It does not seem to be possible to reflect this in fc->buffered though, as
we never known if main connection's c->buffered corresponds to the particular
stream or not. As such, fc->buffered might prevent request finalization
due to sending data on some other stream.
Fix is to implement handling of flush buffers when the c->need_flush_buf
flag is set, similarly to the existing last buffer handling. The same
flag is now used for UDP sockets in the stream module instead of explicit
checking of c->type.
The SF_NOCACHE flag, introduced in FreeBSD 11 along with the new non-blocking
sendfile() implementation by glebius@, makes it possible to use sendfile()
along with the "directio" directive.
Starting with FreeBSD 11, there is no need to use AIO operations to preload
data into cache for sendfile(SF_NODISKIO) to work. Instead, sendfile()
handles non-blocking loading data from disk by itself. It still can, however,
return EBUSY if a page is already being loaded (for example, by a different
process). If this happens, we now post an event for the next event loop
iteration, so sendfile() is retried "after a short period", as manpage
recommends.
The limit of the number of EBUSY tolerated without any progress is preserved,
but now it does not result in an alert, since on an idle system event loop
iteration might be very short and EBUSY can happen many times in a row.
Instead, SF_NODISKIO is simply disabled for one call once the limit is
reached.
With this change, sendfile(SF_NODISKIO) is now used automatically as long as
sendfile() is enabled, and no longer requires "aio on;".
Notably, NAXSI is known to misuse ngx_regex_compile() with rc.options set
to PCRE_CASELESS | PCRE_MULTILINE. With PCRE2 support, and notably binary
compatibility changes, it is no longer possible to set PCRE[2]_MULTILINE
option without using proper interface. To facilitate correct usage,
this change adds the NGX_REGEX_MULTILINE option.
With this change, dynamic modules using nginx regex interface can be used
regardless of the variant of the PCRE library nginx was compiled with.
If a module is compiled with different PCRE library variant, in case of
ngx_regex_exec() errors it will report wrong function name in error
messages. This is believed to be tolerable, given that fixing this will
require interface changes.
The PCRE2 library is now used by default if found, instead of the
original PCRE library. If needed for some reason, this can be disabled
with the --without-pcre2 configure option.
To make it possible to specify paths to the library and include files
via --with-cc-opt / --with-ld-opt, the library is first tested without
any additional paths and options. If this fails, the pcre2-config script
is used.
Similarly to the original PCRE library, it is now possible to build PCRE2
from sources with nginx configure, by using the --with-pcre= option.
It automatically detects if PCRE or PCRE2 sources are provided.
Note that compiling PCRE2 10.33 and later requires inttypes.h. When
compiling on Windows with MSVC, inttypes.h is only available starting
with MSVC 2013. In older versions some replacement needs to be provided
("echo '#include <stdint.h>' > pcre2-10.xx/src/inttypes.h" is good enough
for MSVC 2010).
The interface on nginx side remains unchanged.
If a configuration parsing fails for some reason, ngx_regex_module_init()
is not called, and ngx_pcre_studies remained set despite the fact that
the pool it was allocated from is already freed. This might result in
a segmentation fault during runtime regular expression compilation, such
as in SSI, for example, in the single process mode, or if a worker process
died and was respawned from a master process in such an inconsistent state.
Fix is to clear ngx_pcre_studies from the pool cleanup handler (which is
anyway used to free JIT-compiled patterns).
Without this change, aio used with HTTP/2 can result in connection hang,
as observed with "aio threads; aio_write on;" and proxying (ticket #2248).
The problem is that HTTP/2 updates buffers outside of the output filters
(notably, marks them as sent), and then posts a write event to call
output filters. If a filter does not call the next one for some reason
(for example, because of an AIO operation in progress), this might
result in a state when the owner of a buffer already called
ngx_chain_update_chains() and can reuse the buffer, while the same buffer
is still sitting in the busy chain of some other filter.
In the particular case a buffer was sitting in output chain's ctx->busy,
and was reused by event pipe. Output chain's ctx->busy was permanently
blocked by it, and this resulted in connection hang.
Fix is to change ngx_chain_update_chains() to skip buffers from other
modules unconditionally, without trying to wait for these buffers to
become empty.
Previously, connections to upstream servers used sendfile() if it was
enabled, but never honored sendfile_max_chunk. This might result
in worker monopolization for a long time if large request bodies
are allowed.