Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce the number of AtmosphereResponse's inner object #1951

Open
jfarcand opened this issue Apr 20, 2015 · 38 comments
Open

Reduce the number of AtmosphereResponse's inner object #1951

jfarcand opened this issue Apr 20, 2015 · 38 comments

Comments

@jfarcand
Copy link
Member

As noted by @elakito, right now a new short lived gets created when 'getOutputStream' and 'getWriter' gets called. This is a non sense.

@elakito
Copy link
Contributor

elakito commented Apr 21, 2015

thanks!

2015-04-20 20:30 GMT+02:00 Jeanfrancois Arcand [email protected]:

Closed #1951 #1951.


Reply to this email directly or view it on GitHub
#1951 (comment).

@slovdahl
Copy link
Contributor

slovdahl commented Feb 9, 2016

This fix seems to have a huge impact on memory usage (at least in my case). The memory usage seems to have more than doubled!

I did some highly unscientific tests to prove it. I have a wasync based application that creates as many clients as I tell it to do. I started the server and waited 15 seconds before starting to connect 20000 websocket clients. They were all connected after 40-50 seconds. After a client has successfully connected it starts pinging the server with a 30-second interval. The server responds with a pong on every ping. My server application uses one broadcaster per client.

Memory usage after all clients were connected:

  • Atmosphere 2.2.9: 900-1300 MB
  • Atmosphere 2.3.6: 2000-2700 MB
  • Atmosphere 2.3.7-SNAPSHOT with the fix reverted: 1000-1400 MB

Screenshots from VisualVM:

You'll notice the screenshots from 2.2.9 and 2.3.7-SNAPSHOT look very similar.

Environment:

$ java -version
openjdk version "1.8.0_66-internal"
OpenJDK Runtime Environment (build 1.8.0_66-internal-b17)
OpenJDK 64-Bit Server VM (build 25.66-b17, mixed mode)

Command line options:
-XX:+UseG1GC -Xms4g -Xmx4g

@slovdahl
Copy link
Contributor

slovdahl commented Feb 9, 2016

Also tested with java 7:

$ java -version
java version "1.7.0_95"
OpenJDK Runtime Environment (IcedTea 2.6.4) (7u95-2.6.4-0ubuntu0.15.10.1)
OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)

Screenshots from VisualVM:

No real differences when compared to the java 8 tests.

@slovdahl
Copy link
Contributor

slovdahl commented Feb 9, 2016

Screenshots from VisualVM (java 8):

The revert on the master branch didn't of course apply cleanly because AtmosphereResponse has been renamed to AtmosphereResponseImpl, so I did the revert manually by looking at the diff.

@elakito
Copy link
Contributor

elakito commented Feb 10, 2016

@slovdahl thanks for your analysis. I am curious why caching of the output stream object changed the memory usage in that way. It seems to indicate two things:

  1. the getOutputStream is called once (or at lease not many times) in standard scenarios
  2. the written data to the stream is probably held in the cached object and this data get's not released for a longer time or even has some hidden memory leak in there?

@slovdahl
Copy link
Contributor

The heap dump I looked through while investigating this also suggests that the written data is held in the cached object and therefore isn't garbage collected.

But generally I know too little about the inner workings to make a good analysis of it. If you (or @jfarcand) want to take a look at the heap dump I'd be happy to share it with you privately.

@slovdahl
Copy link
Contributor

@elakito do you have any benchmarks that actually prove that the change improves eg. GC rate, CPU usage or memory usage?

@gdrouet
Copy link
Member

gdrouet commented Feb 14, 2016

just to add my 5 cents, I took at look at the code and those two attributes:

private final Stream stream = new Stream();
private final Writer writer = new Writer(stream);

I don't have any fact that data are retained in memory by "Stream" and "Writer" instances but it does not make me comfortable to see those objects strongly referenced after the message is written. They are leveraging buffers and they can start to consume a lot of memory at least if they are not flushed. Are you able to check the internal buffers memory size of those object?

I've also a question: why we don't initialize Writer attribute lazily? I would expect either Stream or Writer to be used in the application lifecycle, but not both. Maybe it's worst to check how the application behaves with this quick change.

slovdahl added a commit to slovdahl/atmosphere that referenced this issue Feb 14, 2016
@elakito
Copy link
Contributor

elakito commented Feb 14, 2016

@slovdahl i don't have any benchmarks. I observed this creation in a small test case.

@grdouet I agree it is unnecessary to instantiate those attributes at the instantiation time. But doing the creation lazily with only a single copy would have required synchronization. But actually it doesn't matter if it is instantiated more than once, so if we had gone for lazy instantiation, no synchronization would have been required.

But in any case, because of the potential gc-blockage through the reference kept from the resource, we should revert this change as @slovdahl has already done.

@slovdahl
Copy link
Contributor

Sounds like a good idea. I can push the reverts to the atmosphere-2.3.x, atmosphere-2.4.x (and atmosphere-2.4.1.x?) and master branches.

slovdahl added a commit that referenced this issue Feb 14, 2016
This reverts commit fec8ec0.
slovdahl added a commit that referenced this issue Feb 14, 2016
This reverts commit fec8ec0.
slovdahl added a commit that referenced this issue Feb 14, 2016
This reverts commit fec8ec0.
@elakito
Copy link
Contributor

elakito commented Feb 14, 2016

@slovdahl branch 2.4.x exited but unused/un-updated for sometime.
I just merged master to 2.4.x and updated master's version to 2.5.0-SNAPSHOT.
that means atmosphere-2.4.x is active now and the next 2.4.3 will be cut from that branch.
thanks.

@jfarcand
Copy link
Member Author

Hum....I'm not sure about the memory gain here to be honest. @slovdahl what are your VM configuration used? Have you try forcing GC?

@jfarcand
Copy link
Member Author

@slovdahl My point with the garbage collection is if it hasn't executed, then true it will take more memory but if you configure the GC to execute more frequently, that issue won't exists. The root cause of this issue is to find which Atmosphere's object is holding a reference, and clear that reference. For example, the destroy method should set the stream + writer to null when called, so the GC can reclaim them more faster.

@jfarcand jfarcand reopened this Feb 15, 2016
@jfarcand
Copy link
Member Author

I'm re-opening the issue as I;m not 100% convinced about rolling back to old implementation, which was not performant at all.

@thabach
Copy link
Member

thabach commented Feb 15, 2016

I think it's not a question of having the GC run more frequently, but as u suggest of null-ing the fields at some point and prevent those two objects from 'hanging around' - being reachable from a GC root, even on not being needed anymore. They became reachable as of becoming fields and thereby not being temporary, on-call created objects any more.

@slovdahl
Copy link
Contributor

@jfarcand I forced manual GCs via VisualVM and the memory usage did not go down to the same level as with the fix reverted.

I'm open to suggestions, but there is no doubt this change doubled the memory usage in my case.

@thabach
Copy link
Member

thabach commented Feb 15, 2016

With this change, like before (but now with non-anonymous inner classes'-) objects are referencing the effective-final asyncWriter of the AtmosphereResponse, but different to before those objects are not short-lived but kept in the newly introduced fields. This leads to the writer and its buffer being retained even after a destroy()-call null-ing out the writer reference in the AtmosphereResponse (the outer object). The writer's reference count after destroy() does not fall to zero as of the inner objects still being reachable through the newly introduced fields (holding references to the writer in their respective captures) which leads to the situation of the garbage collector not collecting the writer with its default 16k buffer till the AtmosphereResponse itself gets garbage collected. This is different to the original implementation where there were no long-lived inner-objects referencing the writer at the time destroy() was called and null-ing the writer reference lead to the writer object being deterministically garbage-collected as a result.

@slovdahl could you run your test with a destroy() as @jfarcand suggested that sets the two references to null and report if this helps ?

@slovdahl
Copy link
Contributor

What should I destroy? And where should I insert it?

@thabach
Copy link
Member

thabach commented Feb 16, 2016

The idea is to add null-ing of the inner-classes fields in AmtosphereResponse::destroy(...) (on line 194) as follows :

   ...
   asyncIOWriter = null;
   stream = null;
   writer = null;
   destroyed.set(true);
}

and thereby make the inner objects become subject of garbage collection. The AtmosphereResponseis the owner of these inner objects and on destroy(...)is holding the last references to them (and transitively to their captures).

@slovdahl
Copy link
Contributor

Ok, I'll try it later today.

@slovdahl
Copy link
Contributor

No difference:

diff --git a/modules/cpr/src/main/java/org/atmosphere/cpr/AtmosphereResponse.java b/modules/cpr/src/main/java/org/atmosphere/cpr/AtmosphereResponse.java
index 39b5257..615b350 100644
--- a/modules/cpr/src/main/java/org/atmosphere/cpr/AtmosphereResponse.java
+++ b/modules/cpr/src/main/java/org/atmosphere/cpr/AtmosphereResponse.java
@@ -99,8 +99,8 @@ public class AtmosphereResponse extends HttpServletResponseWrapper {
     private String uuid = "0";
     private final AtomicBoolean usingStream = new AtomicBoolean(true);
     private final AtomicBoolean destroyed = new AtomicBoolean(false);
-    private final Stream stream = new Stream();
-    private final Writer writer = new Writer(stream);
+    private Stream stream = new Stream();
+    private Writer writer = new Writer(stream);

     public AtmosphereResponse(AsyncIOWriter asyncIOWriter, AtmosphereRequest atmosphereRequest, boolean destroyable) {
         super(dsr);
@@ -211,6 +211,8 @@ public class AtmosphereResponse extends HttpServletResponseWrapper {
         headers.clear();
         atmosphereRequest = null;
         asyncIOWriter = null;
+        stream = null;
+        writer = null;
         destroyed.set(true);
     }

@thabach
Copy link
Member

thabach commented Feb 16, 2016

Interesting, what happens if you remove the two fields and return new Stream and Writer objects every time the respective getters are called ? Only as an intermediary try in getting at a more conclusive analysis. Could u maybe also additionally try that ?

@slovdahl
Copy link
Contributor

That's exactly what the first screenshot in the previous post is:

@thabach
Copy link
Member

thabach commented Feb 16, 2016

Sorry for not being clearer, what I ment was to try with a version that does not revert the change fully but only to remove the fields, but keep the non-anonymous, named inner classes Stream and Writer (and return new instances of these in the respective getters). A partial revert if you wish. That'd be great 👍!

@slovdahl
Copy link
Contributor

Ah, sorry for the misunderstanding. I'll try that!

@slovdahl
Copy link
Contributor

Not really surprising, but nevertheless, applying the change as suggested by @thabach reduces the memory usage back to normal levels: http://i.imgur.com/ok0qNe3.png

@thabach
Copy link
Member

thabach commented Feb 16, 2016

Could u log if ''destroy()'' is called at all in your scenario then ?

With the last test you showed that no one calling the getters keeps holding tight to the reference to inner objects and even more importantly, that the fields hold the crucial references in preventing the inner objects from being collected.

@slovdahl
Copy link
Contributor

Honestly I don't understand why destroy() would be called? The AtmosphereResponse instance referenced from AtmosphereResource won't be destroyed as long as the clients are connected.

@thabach thabach closed this as completed Feb 16, 2016
@thabach thabach reopened this Feb 16, 2016
@thabach
Copy link
Member

thabach commented Feb 16, 2016

Sorry for accidentally closing the issue, we need @jfarcand for explanations on the flow and why he was suggesting the null-ing in destroy. Jean-François could u comment ?

@jfarcand
Copy link
Member Author

It depends on the transport and the server you are using. For example with Nettosphere the destroy method is invoked on every websocket request. Hence that's why the fix reside inside the destroy. Hence nullying those fields should help saving memory. But the GC will still happens later than when we return a new instance for every call. So it's a design choice...I have no problem with @slovdahl improved patch, but I strongly recommend we document it if we return a new instance on every invocation.

@slovdahl
Copy link
Contributor

slovdahl commented Mar 1, 2016

Okay, so let's revert my revert, but instead create new instances of inner Stream and Writer classes on every getOutputStream() and getWriter() call. So essentially the only change done here will be extracting the anonymous Stream and Writer classes to inner classes.

slovdahl added a commit that referenced this issue Mar 1, 2016
slovdahl added a commit that referenced this issue Mar 1, 2016
slovdahl added a commit that referenced this issue Mar 1, 2016
slovdahl added a commit that referenced this issue Mar 1, 2016
slovdahl added a commit to slovdahl/atmosphere that referenced this issue Mar 1, 2016
@slovdahl slovdahl reopened this Mar 1, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants