Discussion:
Memory utilisation gradually increasing
FULLER, David
2018-07-25 14:19:24 UTC
Permalink
We currently have an issue with memory utilisation in Varnish 5.2.1; we are only using it as a reverse proxy, not the caching functionality.

We are running it in an AWS ECS Docker container, with 1GB of memory allocated. Memory increases daily by around 8% until it tops out and site connectivity problems occur. Redeploying the container resolves the problem and the cycle starts again.

When Varnish starts we have ‘malloc’ set at 100MB; from my understanding this setting is only relevant if caching is being used, which in our case it isn’t.

Has anyone seen a similar problem?

Thanks

UK Parliament Disclaimer: This e-mail is confidential to the intended recipient. If you have received it in error, please notify the sender and delete it from your system. Any unauthorised use, disclosure, or copying is not permitted. This e-mail has been checked for viruses, but no liability is accepted for any damage caused by any virus transmitted by this e-mail. This e-mail address is not secure, is not encrypted and should not be used for sensitive data.
Guillaume Quintard
2018-07-25 14:59:55 UTC
Permalink
Hello David,

Have a look at varnishstat ("varnishstat -1 | grep -e g_space -e g_bytes").
When you are passing, varnish is going to consume Transient storage.
--
Guillaume Quintard
_______________________________________________
varnish-misc mailing list
https://www.varnish-cache.org/lists/mailman/listinfo/varnish-misc
Guillaume Quintard
2018-07-25 17:30:01 UTC
Permalink
Let's keep the mailing list in CC :-)

http://varnish-cache.org/docs/trunk/users-guide/storage-backends.html#transient-storage

You also have Reza's post:
https://info.varnish-software.com/blog/understanding-varnish-cache-memory-usage

Finally, memory will also be consumed by workspaces (one per thread).
--
Guillaume Quintard
Hi Guillaume,
Thanks for the response, I’ve run the command you’ve suggested and get the following:
/ # varnishstat -1 | grep -e g_space -e g_bytes
SMA.s0.g_bytes                 0          .   Bytes outstanding
SMA.s0.g_space         104857600          .   Bytes available
SMA.Transient.g_bytes          0          .   Bytes outstanding
SMA.Transient.g_space          0          .   Bytes available
The Varnish container was redeployed this afternoon and currently shows
memory utilisation around 3% so probably not illustrating the problem very
well right now.
Is there a way to limit the amount of transient storage and clear it when the limit is hit, without affecting performance? Given that we aren’t caching, are there any other settings we should look at to improve memory utilisation?
Kind regards,
David
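The transient-storage page linked above covers exactly this: Transient can be capped by declaring a storage backend with that name when varnishd starts. A minimal sketch (the sizes here are illustrative examples, not recommendations, and the listen address and VCL path are assumptions):

```shell
# Declare a second malloc store named "Transient" to cap storage for
# pass/short-lived objects; without it, Transient is an unbounded malloc store.
# Sizes, listen address and VCL path below are illustrative only.
varnishd -a :80 \
         -f /etc/varnish/default.vcl \
         -s malloc,100M \
         -s Transient=malloc,50M
```

Note that with a capped Transient, pass traffic that exceeds the cap will fail rather than grow memory, so the size needs headroom for peak concurrent responses.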
FULLER, David
2018-07-26 15:05:22 UTC
Permalink
Thanks for your reply.

Am I correct in thinking that even though we’re not using Varnish for caching, objects are still stored using malloc? We have malloc set at 100MB, with 1GB allocated to the Varnish container. Based on the docs you’ve linked to, should we set malloc at 750MB (75% of the container memory), and could this be the cause of the memory problem we’re seeing?

Regards,
David

Dridi Boukelmoune
2018-07-26 15:25:45 UTC
Permalink
Are you reloading VCLs in your container? Loaded VCLs contribute to
the memory footprint but that's usually negligible. However we've seen
setups where cron jobs or similar means would schedule reloads on a
regular basis whether or not the VCL actually changed on disk, leading
to thousands of loaded VCLs. And that could look like a leak, with the
memory footprint gradually increasing.

Dridi
FULLER, David
2018-07-27 13:04:47 UTC
Permalink
Hi Dridi,

Thank you for your response; we do not have any cron jobs or schedules set up. Is there a way to check the number of VCLs currently loaded?

thanks
Dridi Boukelmoune
2018-07-27 13:36:03 UTC
Permalink
Something like `varnishadm vcl.list | wc -l`
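That one-liner can be dressed up a little to also show how many of the loaded VCLs are cold. The awk below assumes the `vcl.list` column layout seen in Varnish 5.2 (status, state, busy count, name); feed it the output of `varnishadm vcl.list`:

```shell
#!/bin/sh
# Summarise how many VCLs are loaded and how many of them are cold.
# Usage: varnishadm vcl.list | sh vcl-count.sh
# Assumes columns: status  state  busy  name  (Varnish 5.2 layout).
awk '
  { total++ }                    # every line is one loaded VCL
  $2 ~ /cold/ { cold++ }         # state column like "cold/cold"
  END { printf "loaded: %d (cold: %d)\n", total, cold + 0 }
'
```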
FULLER, David
2018-07-27 14:41:12 UTC
Permalink
Thanks, we have 1 active VCL and around 35 available (of these, 3 are auto/warm and the remainder cold/cold). The containers running Varnish had to be rebuilt a couple of hours ago, so I'll check varnishadm again on Monday to see whether the numbers have increased. Are the cold/cold VCLs considered loaded in terms of memory usage?
Dridi Boukelmoune
2018-07-28 17:56:07 UTC
Permalink
Yes, cold VCLs are considered loaded in terms of memory usage,
but in a cold state.

It means that they have a lower footprint, but an overhead still
exists. To lower the footprint, varnishd and (well-behaved) VMODs
release any resources that can be acquired again once the VCL is
warmed up prior to its use. Just telling varnish to `vcl.use` a cold VCL
will automatically go through the warm up phase, and set the VCL as
active once it reaches the warm state.

Dridi
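As a sketch of that warm-up path on the CLI (the reload name below is hypothetical, just a stand-in for whatever your reloads are called):

```shell
# A cold VCL is warmed implicitly when selected; no separate step is needed.
varnishadm vcl.list                    # note an available cold/cold VCL
varnishadm vcl.use reload_20180727_1   # hypothetical name; warms, then activates
varnishadm vcl.list                    # it should now show as active and warm
```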
FULLER, David
2018-07-30 12:50:40 UTC
Permalink
Thanks Dridi,

After further investigation I found that we do have a python/cron job running that checks for backend changes and, if there are any, does a vcl.load. This has resulted in a growing number of VCLs being loaded, as you suspected. After redeploying the Varnish container this morning and then monitoring varnishadm, we have gone from around 30 VCLs loaded to over 200 in a few hours. Is there a way to limit the number of VCLs that can be loaded, with older ones being dropped as new ones are loaded?
Dridi Boukelmoune
2018-07-31 09:03:01 UTC
Permalink
For Varnish 6.0 we released a new varnishreload script; see its usage:

https://github.com/varnishcache/pkg-varnish-cache/blob/0ad2f22629c4a368959c423a19e352c9c6c79682/systemd/varnishreload#L46-L74

The main goals were to unify the reload script on all the platforms we
support and simplify the logic by only supporting the privileged
"service manager" use case. As a result it was also easy to add the -m
option to discard old reloads if you have more than the "maximum" number of reload VCLs.

In other words, it's up to the python script scheduled by your cron
job to keep track of reloads and discard older ones according to your
policy.

Dridi
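Pending a move to varnishreload, such a discard policy could be sketched as below. It assumes the cron job loads its reloads under lexically sortable names such as `reload_YYYYmmdd_HHMMSS` (an assumption about the naming scheme, not something varnish enforces), and it relies on GNU `head -n -N`:

```shell
#!/bin/sh
# Keep at most MAX reload VCLs loaded; discard the oldest available ones.
# Assumes reload names sort chronologically and GNU coreutils "head -n -N".
MAX=${MAX:-10}

varnishadm vcl.list |
  awk '$1 == "available" && $4 ~ /^reload_/ { print $4 }' |
  sort |              # oldest names first
  head -n -"$MAX" |   # everything beyond the newest MAX
  while read -r vcl; do
    varnishadm vcl.discard "$vcl"
  done
```

The active VCL is never selected (its status is "active", not "available"), so the loop cannot discard the VCL currently in use.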
FULLER, David
2018-08-01 11:05:25 UTC
Permalink
That’s great, thanks. Does the first VCL listed (labelled boot in column 4 – ‘available cold/cold 0 boot’) need to stay loaded, and is there a recommended number of VCLs to keep loaded?
Dridi Boukelmoune
2018-08-02 10:19:43 UTC
Permalink
There's nothing special about "boot", it's just the VCL name given to
the one loaded at launch time using the -f or -b option.

For the number of VCLs to keep loaded, I'd say enough to feel confident about rollbacks. If it takes you 30 minutes to link a problem to a VCL change, then keep an hour's worth of reloads, for example. Again, your policy, your rules :)

Cheers
