Discussion:
Any way to invoke code from VCL after a request has been serviced?
Ryan Burn
2017-05-06 02:15:30 UTC
Permalink
Hello,
From VCL, is it possible to execute code that runs after a request has
been processed?

I'm looking into writing a module that enables Varnish for distributed
tracing using the OpenTracing project [opentracing.io]. This requires
invoking code at the beginning of a request to start a span and insert
tracing context into the request's headers and invoking code after a
request's been processed to finish the span and measure how long it
took to process.

I recently did a similar project for nginx
[github.com/rnburn/nginx-opentracing]. Nginx provides an
NGX_HTTP_LOG_PHASE [www.nginxguts.com/2011/01/phases/] that allows you
to set up handlers that run after requests are serviced. Can anything
equivalent be done using VCL?

I imagine you could accomplish this by subscribing and regularly reading
from Varnish's shared memory log, but I'd much rather do it directly
if possible.

Thanks, Ryan
Reza Naghibi
2017-05-08 15:10:23 UTC
Permalink
You can do this in a VMOD via PRIV_TASK:


--
Reza Naghibi
Varnish Software
Reza Naghibi
2017-05-08 15:13:25 UTC
Permalink
Sorry, email misfire.

You can do this in a VMOD via PRIV_TASK:

https://varnish-cache.org/docs/trunk/reference/vmod.html#private-pointers

It might make sense to track this stuff in some kind of struct, in which
case, put it into *priv and then register a *free callback. Otherwise, just
put a dummy value into the *priv. *free will get called after the request
is done and you can put your custom code in there.

--
Reza Naghibi
Varnish Software
Ryan Burn
2017-05-12 00:33:57 UTC
Permalink
From the free function, is there any way to get the status code or
other properties of the request? I tried using VRT_r_obj_status with a
stored reference to the context, but that doesn't seem to work since
some of the request's resources have already been reclaimed:

https://github.com/rnburn/varnish-opentracing/blob/master/opentracing/src/trace.cpp#L22

Is there any other place something like the status would be stored?
Guillaume Quintard
2017-05-12 01:38:02 UTC
Permalink
You can put anything in the priv field of the task, but the issue is that
you have to put that data in there, meaning a call to your VMOD from the
VCL.

the VUT.dispatch_f function isn't to be called from a vmod, and I don't
think you need to.

Maybe it's time to take a step back. Can you fill us in on the whole
workflow, notably:
- what data do you inject, and how do you create it?
- what do you need to know about the req/resp/bereq/beresp?

I almost have the feeling that this could be solved through pure vcl+shell.
--
Guillaume Quintard
Ryan Burn
2017-05-12 05:20:03 UTC
Permalink
Sure. The intention with OpenTracing is to provide a common API that
can be used to instrument frameworks and libraries. What happens when,
for example, a span is created or its context injected into a request
header isn’t specified by the standard. It’s up to the particular
tracing implementation used (e.g. LightStep, Zipkin, Jaeger, etc) to
decide what specifically to do.

So, if a user wants to enable Varnish for OpenTracing, I'd expect them
to do something like the following in their VCL:

###
# This is distributed as part of the varnish-opentracing project. It
# imports a varnish module that exposes VCL functions to interact with
# the C++ OpenTracing API
#   https://github.com/opentracing/opentracing-cpp
# and adds commands to the VCL built-in subroutines so that the
# module's functions will be invoked when certain events occur.
include "opentracing.vcl";

# A user also needs to select a tracing implementation to use. This is
# done by importing the implementation's module and initializing the
# tracer in vcl_init. For example, if they're using LightStep they
# might do something like this:
import lightstep;

sub vcl_init {
    lightstep.collector_host("...");
    lightstep.collector_port("...");
    lightstep.init_tracer();
}

# Tracing is then explicitly turned on for a particular request with
# logic in the vcl_recv subroutine. This means that a span will be
# created for the request and any backend requests that result from it.
# The trace's context will also be propagated into the backend request
# headers, so that any tracing done by the backend server can be linked
# to it.
sub vcl_recv {
    # This turns on tracing for all requests.
    opentracing.trace_request();
}
###


Though all the pieces aren't together yet, I have an example set up here:

https://github.com/rnburn/varnish-opentracing/blob/master/example/library/varnish/library.vcl

To go back to the questions:

- what data do you inject, and how do you create it?
You would be injecting a list of key-value pairs that represent the
context of the active span
(https://github.com/opentracing/specification/blob/master/specification.md#spancontext).
Specifically what that means is up to the tracing implementation, but
it would look something like this:


'ot-tracer-traceid': 'f84de504f0287bbc'   // An ID used to identify the Trace.
'ot-tracer-spanid':  'e34088878e7f0ce8'   // An ID used to identify the active span within the Trace.
'ot-tracer-sampled': 'true'               // Heuristic used by the Tracer.

- what do you need to know about the req/resp/bereq/beresp?
Knowing whether the request resulted in an error is pretty important
to record. Other data usually added are the URI, HTTP method, and IP
addresses of the client, server, and backend servers. Some of the
guidelines on what to include are documented here:

https://github.com/opentracing/specification/blob/master/semantic_conventions.md


An example might make this clearer. This shows the breakdown of a
trace representing the action of a user submitting a profile form:

https://github.com/rnburn/nginx-opentracing/blob/master/doc/data/nginx-upload-trace5.png

The server (described in more detail here
https://github.com/rnburn/nginx-opentracing/blob/master/doc/Tutorial.md)
uses nginx as a reverse proxy in front of Node.js servers that update
a database and perform image manipulation. You can see spans created
on the nginx side to track the duration of the request and how long it
passes through various location blocks as well as spans created from
the Node.js server to represent the database activity and image
manipulation. Injecting context into the request headers is what
allows the spans to be linked together so that the entire trace can be
formed.

Guillaume Quintard
2017-05-15 21:19:50 UTC
Permalink
Thanks Ryan, I think I have a clearer picture now.

So, indeed, I think you only need a light vmod and a log analyzer.

If I get what's happening, you need a vmod call in vcl_recv to create a
trace if it doesn't exist yet, and in vcl_backend_fetch to create a span.
Then, you can simply look at the log, and you'll have all the metadata
you need (including IP, ports, timings and such).

You may want a specific function to set the component name per-request, but
that can easily be done through std.log, so I wouldn't worry about it at
first.

Am I completely off, or is that at least remotely sensible?
--
Guillaume Quintard
Ryan Burn
2017-05-16 03:23:52 UTC
Permalink
Definitely closer. But I’m not sure how that approach would work
without having the log analyzer portion built into the VMOD. A
restriction of the OpenTracing API is that the functions to start,
attach tags, and finish a span all need to act on the same object — it
looks roughly like this:

span = tracer.StartSpan(/* start time */, /* parent span, if it exists */);
span.SetTag(/* key */, /* value */);
span.Finish(/* finish time */);

You couldn’t, for example, have the span created in a VMOD and then
have a separate process analyzing the logs attach the tags and specify
the span’s duration.

Not sure if it’s possible, but if I could use the free function set in
a PRIV_TOP structure to query the status code of the response sent,
that would, I think, work well since I could avoid the complexity of
setting up a VSM reader in a VMOD and pulling out the data from the
log hierarchy.

Dridi Boukelmoune
2017-05-16 07:53:38 UTC
Permalink
Post by Ryan Burn
Definitely closer. But I’m not sure how that approach would work
without having the log analyzer portion built into the VMOD. A
restriction of the OpenTracing API is that the functions to start,
attach tags, and finish a span all need to act on the same object — it
span = tracer.StartSpan(/* start time */, /* parent-span if exists /*);
span.SetTag(/* key */, /* value */);
span.Finish(/* finish time */);
You couldn’t, for example, have the span created in a VMOD and then
have a separate process analyzing the logs attach the tags and specify
the span’s duration.
How about a VMOD doing only the StartSpan, std.log for logging keys
and whatnot, and the VUT doing all the rest when it processes the
transaction's logs?

The Finish operation appears to take a timestamp anyway, so it doesn't
need to happen exactly when the transaction finishes, does it?

You could also let your VMOD do the logging (SetTag) so users don't
have to deal with the span id for every log statement, by keeping the
id in the relevant PRIV_ scope. The syntax you are looking for is not
available in VCL.
Post by Ryan Burn
Not sure if it’s possible, but if I could use the free function set in
a PRIV_TOP structure to query the status code of the response sent,
that would, I think, work well since I could avoid the complexity of
setting up a VSM reader in a VMOD and pulling out the data from the
log hierarchy.
Accessing the VSM in a VMOD is possible but I'd recommend not to.

Dridi
Ryan Burn
2017-05-17 21:07:08 UTC
Permalink
Post by Dridi Boukelmoune
How about a VMOD doing the only StartSpan, std.log for logging keys
and whatnot, and the VUT doing all the rest when it processes the
transaction's logs?
But the VUT would be running in a separate process, no? If so, how
does it access the span object returned when you start the span? The
OpenTracing API doesn't support augmenting information to a span
later. The functions to set the tags, finish time, etc all have to act
on the same object that's returned when you first start the span.
Dridi Boukelmoune
2017-05-18 08:45:07 UTC
Permalink
Post by Ryan Burn
Post by Dridi Boukelmoune
How about a VMOD doing the only StartSpan, std.log for logging keys
and whatnot, and the VUT doing all the rest when it processes the
transaction's logs?
But the VUT would be running in a separate process, no? If so, how
does it access the span object returned when you start the span? The
OpenTracing API doesn't support augmenting information to a span
later. The functions to set the tags, finish time, etc all have to act
on the same object that's returned when you first start the span.
You log whatever needs to ultimately be transmitted to the OT server
and let the VUT pick the logs and do the actual API calls. Performing
blocking (API) calls from VCL is also a performance killer, that's why
Varnish logs in memory and can let VUTs find what they need to do the
dirty work.

Take varnishncsa for example. The varnishd process that is technically
the HTTP server doesn't do access logs and instead it's a separate VUT
that picks up the relevant information from the shmlog to produce
them.

Dridi
Ryan Burn
2017-05-18 19:20:28 UTC
Permalink
I don't see how that could work. The context for the active span needs
to be injected into the headers of any backend requests. The headers
need to be modified from the varnishd process, right? And you can't
inject a span context unless the span has been created, so I would
think a VMOD would have to start the span as well; then since the
OpenTracing API to specify other properties of the span requires
acting on the object returned when you start the span, those would all
need to be called from the VMOD.

Also, none of the OpenTracing API functions should block for very
long, if at all. The more expensive work such as uploading the tracing
data would happen in a separate thread.
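
The injection step described above would look something like this in VCL, again assuming a hypothetical "tracing" vmod that wraps the OpenTracing Inject call (the header name X-Trace-Context is also invented):

```vcl
import tracing;  # hypothetical: serializes the active span's context

sub vcl_backend_fetch {
    # The span must already have been started inside varnishd before
    # its context can be serialized into the outgoing headers.
    set bereq.http.X-Trace-Context = tracing.span_context();
}
```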
Guillaume Quintard
2017-05-21 15:56:11 UTC
Permalink
Best way I see for now is to capture a pointer to req in PRIV_TASK and use
it during the release/free operation. You should be okay since that
operation comes before the release of the req, but don't keep anything
after that step.

You'll lose some timing information, as well as the original http objects
(you'll get the vcl-transformed ones) though. Would that suffice?
--
Guillaume Quintard
Post by Ryan Burn
I don't see how that could work. The context for the active span needs
to be injected into the headers of any backend requests. The headers
need to be modified from the varnishd process, right? And you can't
inject a span context unless the span has been created, so I would
think a VMOD would have to start the span as well; then since the
OpenTracing API to specify other properties of the span requires
acting on the object returned when you start the span, those would all
need to called from the VMOD.
Also, none of the OpenTracing API functions should block for very
long, if at all. The more expensive work such as uploading the tracing
data would happen in a separate thread.
Post by Dridi Boukelmoune
Post by Ryan Burn
Post by Dridi Boukelmoune
How about a VMOD doing the only StartSpan, std.log for logging keys
and whatnot, and the VUT doing all the rest when it processes the
transaction's logs?
But the VUT would be running in a separate process, no? If so, how
does it access the span object returned when you start the span. The
OpenTracing API doesn't support augmenting information to a span
later. The functions to set the tags, finish time, etc all have to act
on the same object that's returned when you first start the span.
You log whatever needs to ultimately be transmitted to the OT server
and let the VUT pick the logs and do the actual API calls. Performing
blocking (API) calls from VCL is also a performance killer, that's why
Varnish logs in memory and can let VUTs find what they need to do the
dirty work.
Take varnishncsa for example. The varnishd process that is technically
the HTTP server doesn't do access logs and instead it's a separate VUT
that picks up the relevant information from the shmlog to produce
them.
Dridi
Ryan Burn
2017-05-22 22:31:33 UTC
Permalink
I think that might be what I'm looking for. Is there any function like
VRT_r_resp_status, but that operates on the request object instead of
the context, to let me extract the status code sent?

Dridi Boukelmoune
2017-05-23 11:12:37 UTC
Permalink
Post by Ryan Burn
I don't see how that could work. The context for the active span needs
to be injected into the headers of any backend requests. The headers
need to be modified from the varnishd process, right? And you can't
inject a span context unless the span has been created, so I would
think a VMOD would have to start the span as well; then since the
OpenTracing API to specify other properties of the span requires
acting on the object returned when you start the span, those would all
need to be called from the VMOD.
You can have a VMOD that both populates whatever value you need to
pass through requests and std.log whatever needs to be sent back to
the OpenTracing server. Then the VUT can collect that information, plus
Timestamp records and do the sending (it could even send them by
batches if the OT API supports it).

When it comes to header manipulations (including method, status etc)
they are already logged so a VUT can already pick that up and save
some work from the VMOD.

I'm not familiar with OT but what I described is how Zipnish does its
tracing. Except that Zipnish relies on the X-Varnish header to get a
unique-ish id, so no blocking call needs to be made. So maybe they are
highly different systems, I chimed in because I saw a Zipkin example
while briefly skimming through the docs.
Post by Ryan Burn
Also, none of the OpenTracing API functions should block for very
long, if at all.
The nice thing with blocking calls is that you can add "until they
do". And even if they only take a handful of milliseconds, that's
still orders of magnitude slower than regular VCL code. If I
understand correctly only getting an id should be necessary in
your case.
Post by Ryan Burn
The more expensive work such as uploading the tracing
data would happen in a separate thread.
And your VUT can be that separate thread (well in a separate process
too) so that you don't need to care about synchronizing anything, it's
all done by libvarnishapi. And it can work both in "real time" on a
live varnish instance or on a log dump [1] if the OpenTracing API allows
deferring span submission after the fact (or for testing purposes).

A VUT can survive a crash of the varnishd child, a thread created by a
VMOD can't.

Cheers,
Dridi

[1] see varnishlog -w and -r
Ryan Burn
2017-05-23 21:15:36 UTC
Permalink
Post by Dridi Boukelmoune
When it comes to header manipulations (including method, status etc)
they are already logged so a VUT can already pick that up and save
some work from the VMOD.
It doesn't need to just view the headers; it needs to add headers to encode the
active span context. See
http://opentracing.io/documentation/pages/api/cross-process-tracing.html
And the active span needs to be started before those headers can be added.
Post by Dridi Boukelmoune
I'm not familiar with OT but what I described is how Zipnish does its
tracing. Except that Zipnish relies on the X-Varnish header to get a
unique-ish id, so no blocking call needs to be made. So maybe they are
highly different systems, I chimed in because I saw a Zipkin example
while briefly skimming through the docs.
Zipkin is a distributed tracing system that provides OpenTracing
implementations, but Zipnish is just using it as a monitor for
varnish. It's not doing context propagation. If you're only interested
in monitoring varnish, that's fine; but if you want to see how a
request is processed in an entire distributed system (i.e. not just
varnish, but the backend servers it's sitting in front of or any other
service that might be in front of varnish), then you need to do
context propagation.
Guillaume Quintard
2018-02-06 21:51:16 UTC
Permalink
Sorry to dig that one up, but I stumbled upon it again, and it got me
thinking: can't we just put the pointer to req->err_code in priv? That's not
super clean, but from what I re-read in the code, it should still be there
when it's time to clean up the priv.
--
Guillaume Quintard
Guillaume Quintard
2017-05-08 16:13:50 UTC
Permalink
That's the way to do it in a vmod, indeed.

However Ryan, I don't get why you are reluctant to use the logs. By using
the C API, you can just define callbacks and get called every time a
request/transaction ends, so you don't need to poll.
--
Guillaume Quintard


Ryan Burn
2017-05-08 23:08:56 UTC
Permalink
Thanks Reza and Guillaume. I didn't realize there was a way to set up
callbacks on the VSM. I think either of the approaches will work for
me.

Guillaume Quintard
2017-05-09 07:40:42 UTC
Permalink
Have a look at varnishncsa and varnishlog, and more specifically to the
function they set VUT.dispatch_f to, that should put you on the right
track. If not, ping me on IRC, or here.
--
Guillaume Quintard
Dridi Boukelmoune
2017-05-09 10:13:58 UTC
Permalink
On Tue, May 9, 2017 at 9:40 AM, Guillaume Quintard
Post by Guillaume Quintard
Have a look at varnishncsa and varnishlog, and more specifically to the
function they set VUT.dispatch_f to, that should put you on the right
track. If not, ping me on IRC, or here.
The shared memory log is the best place to do that, however the VUT
(Varnish UTility) API is not yet public [1] so for now you have to
carry a copy of the headers in your project. For Varnish 5.1.2+ I
started a tutorial-as-a-repo [2] although progress on the tutorial
side is around 0% at the moment. You still get a working project for a
VUT (and you can get rid of the VMOD or VCL stuff if you don't need
them).

You can probably find libvarnishapi bindings for other languages but
that's not something we support.

Dridi

[1] https://github.com/varnishcache/varnish-cache/pull/2314
[2] https://github.com/Dridi/varnish-template
Ryan Burn
2017-05-12 00:48:49 UTC
Permalink
Is it possible to set up a VUT.dispatch_f function in a varnish
module? The OpenTracing API for managing a span requires operating on
a single object; and since I need to modify varnish backend requests
to inject tracing context into their headers, I would think it has to
run from a module.

Are the functions VUT_Init, VUT_Setup, VUT_Main, etc allowed to be
invoked from a module, or are they only meant to work in a stand-alone
process?

Dridi Boukelmoune
2017-05-09 10:17:10 UTC
Permalink
Post by Ryan Burn
I'm looking into writing a module that enables Varnish for distributed
tracing using the OpenTracing project [opentracing.io]. This requires
invoking code at the beginning of a request to start a span and insert
tracing context into the request's headers and invoking code after a
request's been processed to finish the span and measure how long it
took to process.
You can see this kind of tracing in action in the Zipnish [1] project.
It's Varnish's equivalent to Zipkin and if I'm not wrong was or is
compatible with the latter.

Dridi

[1] https://github.com/varnish/zipnish