Hidden dangers of mutation

Introduction

Hi, this is Trueman, an Application Engineer working at Rakuten Group's Branch. I've written some articles before about some interesting and useful things we've done here that you might also want to try for yourself. This time I'd like to share a story about unfortunate bug we investigated and fixed that you definitely want to avoid recreating (or at least avoid pushing it to production). Note that although the code I'm sharing here is based on a real problem that we've had, it's also hugely simplified and anonymized to make the problem easier to see and hide any potential confidential information.

Context

In one of our web applications, we have a form that we save in our session. In order to show a before-after view of the form changes, we keep both an old and a new version of this form in session.

Java

  1.   public class SessionData {
  2.     public CustomForm oldForm;
  3.     public CustomForm newForm;
  4.   }

The flow of this session data looks something like this

The flow might seem kind of complicated but the real trouble is really just in the POST request side. The full flow is shown above just to show you how we can get sort of a looping effect which is part of what causes the bug. Here's a more simplified version of the flow to consider:

 

Here are some code snippets that kind of show you how this was initially implemented (again, this has been heavily edited to only show the problem):

POST Endpoint

Java

  1. @PostMapping
  2.   public Response check(HttpSession httpSession, HttpServletRequest request) {
  3.     SessionData sessionData = getDataFromSession(httpSession);
  4.     UserInput userInput = getUserInput(request);
  5.  
  6.     CustomForm form = new CustomForm(userInput, sessionData.oldForm);
  7.     
  8.     sessionData.newForm = form;
  9.     saveToSession(sessionData, httpSession);
  10.  
  11.     if (form.isInvalid) {
  12.       return redirectToFormPage();
  13.     } else {
  14.       return redirectToConfirmPage();
  15.     }
  16.   }

CustomForm

Java

  1. public class CustomForm {
  2.   public List<Integer> listA = new ArrayList<>();
  3.   public List<Integer> listB = new ArrayList<>();
  4.   public UserInput userInput;
  5.   public boolean isInvalid;
  6.  
  7.   public CustomForm(UserInput userInput, CustomForm previousForm) {
  8.     this.userInput = userInput;
  9.     this.listA = previousForm.listA;
  10.     this.listB = previousForm.listB;
  11.     validate();
  12.   }
  13.  
  14.   public void validate() {
  15.     if (isInvalid(userInput)) {
  16.       listA.addAll(listB);
  17.       listB = new ArrayList<>();
  18.       isInvalid = true;
  19.     }
  20.  
  21.     ...
  22.  
  23.     isInvalid = false;
  24.   }
  25. }

Basically what we want this code to do is when we receive the post request, we check to see if the user input is valid. If it's invalid, then we want listA to hold all the elements of both listA and listB, and we want listB to become an empty list, and then send this all back to the user. And this code kind of worked just fine... for one iteration of the loop shown above. However, starting from the second loop, the bug becomes apparent. Can you spot it?

Expected Behaviour

Let's look at what we want our code to do for two iterations, assuming the user submits an invalid input every time.

POST Endpoint

Java

  1. //First Iteration
  2. //Data from POST request
  3. sessionData.oldForm.listA = [1, 2, 3]
  4. sessionData.oldForm.listB = [4, 5]
  5. sessionData.newForm.listA = [1, 2, 3]
  6. sessionData.newForm.listB = [4, 5]
  7.  
  8. //After Creating new custom form, validating, and saving to session
  9. sessionData.oldForm.listA = [1, 2, 3]
  10. sessionData.oldForm.listB = [4, 5]
  11. sessionData.newForm.listA = [1, 2, 3, 4, 5]
  12. sessionData.newForm.listB = [ ]
  13.  
  14. //Second Iteration
  15. //Data from POST request
  16. sessionData.oldForm.listA = [1, 2, 3]
  17. sessionData.oldForm.listB = [4, 5]
  18. sessionData.newForm.listA = [1, 2, 3, 4, 5]
  19. sessionData.newForm.listB = [ ]
  20.  
  21. //After Creating new custom form, validating, and saving to session
  22. sessionData.oldForm.listA = [1, 2, 3]
  23. sessionData.oldForm.listB = [4, 5]
  24. sessionData.newForm.listA = [1, 2, 3, 4, 5]
  25. sessionData.newForm.listB = [ ]

Actual Behaviour

Here's what we were actually seeing instead

POST Endpoint

Java

  1. //First Iteration
  2. //Data from POST request
  3. sessionData.oldForm.listA = [1, 2, 3]
  4. sessionData.oldForm.listB = [4, 5]
  5. sessionData.newForm.listA = [1, 2, 3]
  6. sessionData.newForm.listB = [4, 5]
  7.  
  8. //After Creating new custom form, validating, and saving to session
  9. sessionData.oldForm.listA = [1, 2, 3, 4, 5]
  10. sessionData.oldForm.listB = [4, 5]
  11. sessionData.newForm.listA = [1, 2, 3, 4, 5]
  12. sessionData.newForm.listB = [ ]
  13.  
  14. //Second Iteration
  15. //Data from POST request
  16. sessionData.oldForm.listA = [1, 2, 3, 4, 5]
  17. sessionData.oldForm.listB = [4, 5]
  18. sessionData.newForm.listA = [1, 2, 3, 4, 5]
  19. sessionData.newForm.listB = [ ]
  20.  
  21. //After Creating new custom form, validating, and saving to session
  22. sessionData.oldForm.listA = [1, 2, 3, 4, 5, 4, 5]
  23. sessionData.oldForm.listB = [4, 5]
  24. sessionData.newForm.listA = [1, 2, 3, 4, 5, 4, 5]
  25. sessionData.newForm.listB = [ ]

Maybe seeing what's happening with the data here will give you a better idea of what went wrong. Try going back to the code to figure out what went wrong.

Problem

The root cause of this bug is due to a combination of two coding choices and a small but important detail about how Java handles argument passing in method calls:

  1. Java passes arguments by value but the value being passed for objects is the object's pointer.
  2. Setting our listA and listB properties equal to references of the listA and listB of the previous form.
  3. Mutating listA.

Let's take a closer look at each of these elements.

Java passes arguments by value but...

When we pass an argument in a method call in Java, we are passing that argument by value, which means we pass in a copy of that argument. However, something to note is that when we call a method and pass in an object as an argument, we're not passing in a copy of the object itself but a copy of the pointer to the original object. This means if you modify the object inside your method, you're modifying the same object that the caller might still be working with later. There's a lot of information out there already about the nuances of this (here's one place with some very good explanations: https://stackoverflow.com/questions/40480/is-java-pass-by-reference-or-pass-by-value) so I won't go over all the details.

What this means for us though is that in

POST Endpoint

Java

  1.   @PostMapping
  2.   public Response check(HttpSession httpSession, HttpServletRequest request) {
  3.     SessionData sessionData = getDataFromSession(httpSession);
  4.     UserInput userInput = getUserInput(request);
  5.  
  6.     CustomForm form = new CustomForm(userInput, sessionData.oldForm);
  7.      
  8.     sessionData.newForm = form;
  9.     saveToSession(sessionData, httpSession);
  10.     ...

After we mutate sessionData.oldForm inside the new CustomForm() constructor, we are saving the mutated sessionData.oldForm in our session.

This behaviour on its own isn't really problematic and it's actually a feature of Java that's often used for object oriented programing. However, it can lead to unexpected results if we're not careful.

Setting List property by reference

Let's take a look at the code again where this happens:

CustomForm

Java

  1. public class CustomForm {
  2. ...
  3.  
  4.   public CustomForm(UserInput userInput, CustomForm previousForm) {
  5.     this.userInput = userInput;
  6.     this.listA = previousForm.listA;
  7.     this.listB = previousForm.listB;
  8.     validate();
  9.   }
  10.  
  11.   public void validate() {
  12.     if (isInvalid(userInput)) {
  13.       listA.addAll(listB);
  14.       listB = new ArrayList<>();
  15. ...

When we set 

Java

  1. this.listA = previousForm.ListA;

What we're doing is actually setting this.listA to point to the same list as previousForm.ListA (instead of giving this.listA a copy of previousForm.ListA). This means that any changes we make to this.listA later, also affects previousForm.listA. Here's an image of what this looks like after setting the listA and listB properties.

This isn't necessarily a problem if we don't mutate this.listA... except we do.

Mutating listA

When we want listA to take on the elements of both listA and listB, instead of reassigning this.listA to point to a new list of both elements (which would then leave previousForm.ListA untouched), we instead mutate the object to which this.listA is pointing. We do, however, reassign listB to point to a new empty list.

Java

  1. listA.addAll(listB);
  2. listB = new ArrayList<>();

This is what the state of our objects after these two lines:

Just looking at the picture you can already see that something doesn't look right. The problem fully becomes apparent when we consider the overall flow.

In order to see how this all comes together to create our bug, we need to take a look at the caller of the CustomForm constructor:

POST Endpoint

Java

  1.   @PostMapping
  2.   public Response check(HttpSession httpSession, HttpServletRequest request) {
  3.     SessionData sessionData = getDataFromSession(httpSession);
  4.     UserInput userInput = getUserInput(request);
  5.  
  6.     CustomForm form = new CustomForm(userInput, sessionData.oldForm);
  7.     
  8.     sessionData.newForm = form;
  9.     saveToSession(sessionData, httpSession);
  10.  
  11.     if (form.isInvalid) {
  12.       return redirectToFormPage();
  13.     } else {
  14.       return redirectToConfirmPage();
  15.     }
  16.   }

Notice how we create our form from sessionData.oldForm, before assigning the new form we created to sessionData.newForm. Because of the first two points from earlier, by the time we save our session data to session, sessionData.oldForm has already been unexpectedly modified

At first, we don't notice the problem because when we get redirected to the GET request, we load the page only showing the newForm. However, if we try the POST request again, then this happens:

 

At this point when we redirect to the GET endpoint and try to display the newForm with the incorrect listA, our application fails with a 500 error.

Solution

After understanding the problem, finding the solution is quite simple. Take away at least one of the two components of the problem:

  1. Setting our listA and listB properties equal to references of the listA and listB of the previous form.
  2. Mutating listA.

Ideally avoiding mutation would be ideal because of potential unintended consequences like this very problem. However, the actual code was much more complex and involved than the snippets you see here so we decided to keep the mutation as-is to minimize the scope of the change. Instead we simply set listA and listB properties to be equal to copies of the listA and listB of the previous form. This way, when we mutate this.listA, we leave the original previousForm.listA untouched.

From the old code:

CustomForm

Java

  1. public class CustomForm {
  2. ...
  3.  
  4.   public CustomForm(UserInput userInput, CustomForm previousForm) {
  5.     this.userInput = userInput;
  6.     this.listA = previousForm.listA;
  7.     this.listB = previousForm.listB;
  8.     validate();
  9.   }
  10.  
  11.   public void validate() {
  12.     if (isInvalid(userInput)) {
  13.       listA.addAll(listB);
  14.       listB = new ArrayList<>();
  15. ...
  16.  

to this:

CustomForm

Java

  1. public class CustomForm {
  2. ...
  3.  
  4.   public CustomForm(UserInput userInput, CustomForm previousForm) {
  5.     this.userInput = userInput;
  6.     this.listA = new ArrayList<>(previousForm.listA);
  7.     this.listB = new ArrayList<>(previousForm.listB);
  8.     validate();
  9.   }
  10.  
  11.   public void validate() {
  12.     if (isInvalid(userInput)) {
  13.       listA.addAll(listB);
  14.       listB = new ArrayList<>();
  15. ...
  16.  

With this small change, what the validate method's operation looks like this:

As for the session data, the situation now looks like this:

Which is exactly what we want. If the user was to go through another iteration of this, the situation would like like this:

Basically the situation would remain unchanged, as expected.

Conclusion

The takeaway I hope you get from this story is that whenever you are mutating objects in your code, be mindful of what else might be referencing that same object. Sometimes it's safer to make copies of objects instead of making new references to existing ones (But this can potentially have negative performance consequences). I personally prefer the more functional way of writing code with making almost all objects immutable and almost all methods have no side effects. Unfortunately, that might be difficult or impossible when working with legacy codebases so we must keep these potential issues in mind.

Obligatory Plug

Thanks for reading. I hope this will be useful for you. If you’re interested in looking into and solving interesting problems like these, consider applying to join us at Rakuten.