Hello World

Introduction

Hi, this is Trueman from the Osaka Branch. I left Canada to move to Japan and join Rakuten at the end of 2019 as an Application Engineer. I’ve learned a lot since my first little “hello world” script and would like to share some of that with anyone who might be interested.

If you're a developer or someone who has dabbled even a little in programming, you probably still remember your first "hello world" program. That first time you "code" something that works, seeing your first program print out the words "hello world"... Maybe right away you began to imagine all the different possibilities. Or maybe you were a little underwhelmed and wondered what all the excitement was about, since all you did was print out some text.

Well, this is a story about one of those possibilities. Starting with an actual on-the-job problem, imagining a solution, and then actually building that solution one step at a time, starting from "hello world".

Problem

We were going to launch a new service with new features and new requirements. One of these requirements involved storing and displaying many more images than we do on our existing services. In order to do this, we decided to use DMC (something like Rakuten's equivalent of Amazon S3, an object storage service, but only for images). DMC has a very simple API that allowed us to upload new/updated images, view images, and delete them. It also has separate staging and production servers allowing our STG and PROD environments to connect to their respective servers without interfering with each other. However, there was no downloadable version of DMC that we could run on our own machines for local development/testing.

Initially, we just connected our local application to STG DMC for dev/testing. This worked well enough to finish development in time for service launch. However, we had several concerns with this set-up:

Dead Space Accumulation

Every time we upload an image to DMC from our local testing environments, the image takes up a small amount of space in DMC (this in itself is not a problem and is, in fact, the whole point of using DMC). However, every time we reset our local environment's DB tables, we lose the ability to view and delete the image (since the randomly generated key to this image on DMC is stored in the DB). In practice, developers never bother to delete the uploaded images before clearing the DB (and certain automated tests complete the upload image→clean DB cycle without giving the developer a chance to clean the image from DMC). This means that basically every time an image is uploaded to DMC from a local machine, a little bit of storage space on the DMC server is lost forever.

Local Interference with STG

Our application has a mechanism to prevent generating the same random key for two different images. However, this mechanism works by referencing the existing keys stored in the DB. Since the STG DB contains different data from each developer's local DB, there is a small but non-zero chance that a local machine's randomly generated key might just happen to match one already being used in STG. If this were to happen, the image uploaded to DMC from the local environment would overwrite an existing STG image.
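To make the risk concrete, here is a rough Kotlin sketch of this kind of uniqueness check. The function name, the key alphabet, and the key length are made up for illustration (our real key generator isn't shown in this post). The point is only that the check can see nothing but the keys in the DB it is connected to.

```kotlin
import kotlin.random.Random

// Hypothetical key generator: keeps drawing random keys until one is not in knownKeys.
// A tiny alphabet and key length are used only to make collisions easy to see.
fun generateUniqueKey(knownKeys: Set<String>, random: Random): String {
    val alphabet = "ab"
    while (true) {
        val key = (1..2).map { alphabet[random.nextInt(alphabet.length)] }.joinToString("")
        if (key !in knownKeys) return key
    }
}

fun main() {
    val stgKeys = setOf("aa", "ab", "ba", "bb") // keys already used in the STG DB
    val localKeys = emptySet<String>()           // a freshly reset local DB knows none of them

    // The local check passes, yet the key is already taken on the shared STG DMC.
    val localKey = generateUniqueKey(localKeys, Random(42))
    println("local key '$localKey' collides with STG: ${localKey in stgKeys}")
}
```

With realistic key lengths the chance of this happening is tiny, but the local check can never bring it down to zero.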

Test Dependency on External Service

We wanted to avoid/minimize the dependencies of our automated tests on external services. With some of our tests dependent on STG DMC, it would be possible for tests to fail because of network issues or if STG DMC went down for maintenance or other problems.

Limitations to Testing

By connecting our application to the STG DMC, we weren't able to fully test how our application would handle exceptional responses from DMC (e.g., a 500 or a server maintenance response). We could and did write unit tests where we mocked these responses, but we could not exercise them in full integration tests.

Solution

In order to address the concerns listed above, we decided to build our own local mock DMC mini-application to implement the parts of the DMC API that we needed. The mock DMC solution directly addresses all four concerns:

  1. Dead Space Accumulation: since the mock DMC exists locally where we have full control, we can reset the mock DMC storage space just as easily as resetting our local DB.
  2. Local Interference with STG: by connecting our application to the local mock DMC instead of STG, there's zero interaction with the STG environment and no risk of interference.
  3. Test Dependency on External Service: mock DMC runs locally on the developer's machine so our automated tests are now completely independent from STG DMC's server/network status.
  4. Limitations to Testing: since we have full control over mock DMC, we can add our own endpoints to allow tests to configure mock DMC state to whatever we need to test (like an endpoint that puts mock DMC in maintenance mode).
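As a rough illustration of point 4, here is a minimal stand-alone sketch of a mock with a test-only switch. It uses only the JDK's built-in HttpServer rather than the framework we actually used, and the /test/maintenance and /images paths, the enabled=true query string, and the 503 status are all assumptions made up for this example, not the real mock's or DMC's API.

```kotlin
import com.sun.net.httpserver.HttpExchange
import com.sun.net.httpserver.HttpServer
import java.net.HttpURLConnection
import java.net.InetSocketAddress
import java.net.URI
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical test-only state: when true, the mock pretends to be down for maintenance.
val maintenanceMode = AtomicBoolean(false)

fun respond(exchange: HttpExchange, status: Int, body: String) {
    val bytes = body.toByteArray()
    exchange.sendResponseHeaders(status, bytes.size.toLong())
    exchange.responseBody.use { it.write(bytes) }
}

fun startMock(): HttpServer {
    // Port 0 asks the OS for any free port.
    val server = HttpServer.create(InetSocketAddress("127.0.0.1", 0), 0)
    // Test setup endpoint (assumed shape): /test/maintenance?enabled=true or ?enabled=false
    server.createContext("/test/maintenance") { exchange ->
        maintenanceMode.set(exchange.requestURI.query == "enabled=true")
        respond(exchange, 200, "maintenance=${maintenanceMode.get()}")
    }
    // Stand-in for the image endpoints the application under test would call.
    server.createContext("/images") { exchange ->
        if (maintenanceMode.get()) respond(exchange, 503, "maintenance")
        else respond(exchange, 200, "image bytes would go here")
    }
    server.start()
    return server
}

// Sends a GET request and returns only the HTTP status code.
fun statusOf(url: String): Int {
    val connection = URI(url).toURL().openConnection() as HttpURLConnection
    try {
        return connection.responseCode
    } finally {
        connection.disconnect()
    }
}

fun main() {
    val server = startMock()
    val base = "http://127.0.0.1:${server.address.port}"
    println(statusOf("$base/images/abc"))            // 200 in normal mode
    statusOf("$base/test/maintenance?enabled=true")  // a test flips the switch
    println(statusOf("$base/images/abc"))            // 503 while in maintenance mode
    server.stop(0)
}
```

An integration test can flip the switch, run the application against the mock, and then flip it back, which is not something we could ever ask of the shared STG DMC.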

Creating the Mock

So now that we had decided on the solution to our problem, it was time to implement it. But where to start? I really wanted to say this is where we start with the classic "hello world" app, but there's actually one other step before that. The first thing to do is to check the requirements, that is, the specifications of the DMC API endpoints that we will actually be using and mocking.

Endpoints
PUT /somePath/:service_id/:user_id/:file_key
POST|DELETE /somePath2/:service_id/:user_id(/:file_key)
GET /somePath3/:service_id/:ref_key
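To make the path parameters concrete, here is a small Kotlin sketch that pulls them out of a request path with regular expressions. The Target class and the parse function are hypothetical names for illustration only. In the actual mock, the web framework's routing does this work for us.

```kotlin
// Path templates from the endpoint list above ("somePath" etc. are placeholders).
val putRoute = Regex("/somePath/([^/]+)/([^/]+)/([^/]+)")
val postDeleteRoute = Regex("/somePath2/([^/]+)/([^/]+)(?:/([^/]+))?")
val getRoute = Regex("/somePath3/([^/]+)/([^/]+)")

// Hypothetical holder for whatever parameters a route carries.
data class Target(val serviceId: String, val userId: String? = null, val key: String? = null)

fun parse(method: String, path: String): Target? {
    val route = when (method) {
        "PUT" -> putRoute
        "POST", "DELETE" -> postDeleteRoute
        "GET" -> getRoute
        else -> return null
    }
    val groups = route.matchEntire(path)?.groups ?: return null
    return if (method == "GET") {
        // GET carries a service_id and a ref_key, but no user_id
        Target(serviceId = groups[1]!!.value, key = groups[2]!!.value)
    } else {
        // groups[3] is null when the optional file_key segment is absent
        Target(groups[1]!!.value, groups[2]!!.value, groups[3]?.value)
    }
}

fun main() {
    println(parse("PUT", "/somePath/svc/user1/photo.jpg"))
    println(parse("DELETE", "/somePath2/svc/user1"))
    println(parse("GET", "/somePath3/svc/ref123"))
}
```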

Of course, there's a bit more detail to the API specs, but this is probably enough to give you an idea of what we want to build. I decided to create the mock using Spring Boot with an in-memory H2 database to store the images. I chose Spring Boot because our team is quite familiar with it, but there are definitely other good alternatives as well. I chose an in-memory H2 DB over a solution involving disk I/O to keep the mock fast for tests. Of course, this only works because the application is just a mock used for local testing and not for production use, since storing images in memory obviously doesn't scale well.

Hello World

With all the initial preparation out of the way, we can finally get started. First we follow a simple hello world tutorial to get a working program: https://spring.io/guides/gs/spring-boot/

But for our purposes, we didn't even need to complete the whole tutorial. Finishing it up to the end of "Create a Simple Web Application" already gives us a runnable application with a working GET endpoint we can test.

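import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class HelloController {

    // Responds to GET / with a plain-text body
    @GetMapping("/")
    public String index() {
        return "Hello World";
    }
}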

Adding DB support

Next we add some dependencies and configuration to get our application working with an in-memory H2 database using JPA to make queries: https://www.baeldung.com/spring-boot-h2-database

At this point, we can create a super simple repository, entity, and controller endpoints to confirm our set-up works. I also had to search a little for references on how to convert an image to a byte array for storage and back to an image for display: https://www.baeldung.com/spring-mvc-image-media-data

@Repository
interface DmcImageRepository : JpaRepository<DmcImage, String> {}

...
@Entity
class DmcImage(
    @Id
    @Column(name = "ref_key")
    val refKey: String = "",

    @Lob
    @Column(name = "image_data")
    val imageData: ByteArray? = null
)

...

    // Temporary endpoint: stores a bundled test image under the key "test.jpg"
    @GetMapping("/testUpload")
    fun testUpload(): String {
        val path = Paths.get("src/main/resources/images/test.jpg")
        // use {} closes the stream once the bytes have been read
        val imageAsByteArray = Files.newInputStream(path).use { IOUtils.toByteArray(it) }
        val testImage = DmcImage("test.jpg", imageAsByteArray)
        dmcImageRepository.save(testImage)
        return "finished"
    }

    // Temporary endpoint: returns the image stored under the requested key
    // (call /testRetrieve/test.jpg after /testUpload)
    @RequestMapping(value = ["/testRetrieve/{refKey}"], method = [RequestMethod.GET])
    fun getImageFromDb(@PathVariable refKey: String): ResponseEntity<ByteArray> {
        val imageAsByteArray = dmcImageRepository.getById(refKey).imageData
        return ResponseEntity.ok().contentType(MediaType.IMAGE_JPEG).body(imageAsByteArray)
    }

Incremental development

Once we had a minimal prototype working, we could add functionality one small piece at a time: code it, test it manually, commit, and start on the next piece. Here are some commit logs to show that progress:

04 Mar 2022 - added not-found image error handling
04 Mar 2022 - separated save key from ref key
07 Mar 2022 - separated key to be able to distinguish serviceId
07 Mar 2022 - added delete functionality
07 Mar 2022 - modified upload endpoint to actually allow uploading new image
07 Mar 2022 - changed upload endpoint to return xml

XML serialization

Since DMC's API specifies that it returns its responses in XML format, we also had to configure our mock to return XML. I wasn't able to find a straightforward tutorial with all the information I needed to get it set up, so I'll include some code snippets here for your reference:

dependencies {
    ...
    implementation("com.fasterxml.jackson.dataformat:jackson-dataformat-xml:2.13.1")
    ...
}
...
@JacksonXmlRootElement(localName = "response")
@JsonInclude(JsonInclude.Include.NON_EMPTY)
data class PutResponseXml(
    @field:JacksonXmlProperty(localName = "method")
    val method: String = "store",
    ...
    @field:JacksonXmlProperty(localName = "file")
    val file: FileXml? = null
)
@XmlAccessorType(XmlAccessType.NONE)
@JsonInclude(JsonInclude.Include.NON_EMPTY)
data class FileXml(
    @field:JacksonXmlProperty(localName = "file_key")
    val fileKey: String = "",
    ...
)
...

    @RequestMapping(value = ["/somePath/{serviceId}/{userId}/{fileKey}"],
        consumes = ["image/*"],
        produces = [MediaType.APPLICATION_XML_VALUE],
        method = [RequestMethod.PUT])
    fun saveImageToDb(@PathVariable serviceId: String,
                      @PathVariable userId: String,
                      @PathVariable fileKey: String,
                      ...
                      imageFile: InputStream): ResponseEntity<PutResponseXml> {
        ...
        val fileXml = getFileXml(fileKey, dmcImage.refKey)
        val responseXml = PutResponseXml(result = "ok", status = "created", serviceId = serviceId, userId = userId, tagId = "", file = fileXml)
        return ResponseEntity.ok().body(responseXml)
    }

    fun getFileXml(fileKey: String, refKey: String): FileXml {
        return FileXml(fileKey = fileKey, refKey = refKey)
    }

More Incremental development

Here are some more commit logs:

07 Mar 2022 - distinguished between saving new image vs overwriting old one
07 Mar 2022 - added create and update time to upload endpoint
08 Mar 2022 - changed delete endpoint to also return xml
08 Mar 2022 - added extra endpoint for test setup and rearranged endpoints
08 Mar 2022 - added link to api documentation
08 Mar 2022 - added dockerfile and .dockerignore for use with julie-docker and car-repair-docker

Dockerizing the Mock

At this point, the mock was in a pretty good state, usable for most of our local testing needs. Next, we wanted to put it into its own docker container for two reasons.

The first is that we already have a docker-compose file to spin up containers for our DB, webservers, other mocks, etc. for local development. Integrating the mock DMC into that docker-compose file makes it really easy for developers to use, without needing to also spin up the mock DMC app in IntelliJ or Gradle.

The second reason is that we need to be able to spin up the mock DMC on our Jenkins server as well, for the automated tests that Jenkins runs on STG/PROD deployment. Again, dockerizing the mock makes this process very straightforward.

Here's the initial Dockerfile (more improvements were made to it later, but this was a working starting point):

FROM gradle:jdk8 AS TEMP_BUILD_IMAGE
ENV APP_HOME=/usr/app
WORKDIR $APP_HOME

# only download dependencies first so docker can cache the dependencies layer
# (with more than one source file, the COPY destination must end with a slash)
COPY build.gradle.kts settings.gradle.kts $APP_HOME/

# the "|| true" is meant to silently ignore expected failure due to no source code copied at this stage
RUN gradle clean build --no-daemon > /dev/null 2>&1 || true

COPY ./ $APP_HOME
RUN gradle clean build --no-daemon

# actual container
FROM java:openjdk-8-jre-alpine
ENV ARTIFACT_NAME=car-repair-mock-dmc-1.0.0.jar
ENV APP_HOME=/usr/app/
WORKDIR $APP_HOME

COPY --from=TEMP_BUILD_IMAGE $APP_HOME/build/libs/$ARTIFACT_NAME .

EXPOSE 8012
ENTRYPOINT exec java -jar ${ARTIFACT_NAME}

And the docker-compose file:

version: '2'
services:
...
  mock_dmc:
    build:
      context: ../car-repair-mock-dmc
    image: mock_dmc
    ports:
      - "8012:8012"

Debugging Dockerized Mock

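A small thing, but it turns out that reading files from disk works differently in a docker container than in IntelliJ. The container only holds the built jar, so a path under src/main/resources no longer exists there and the image has to be read from the classpath instead. So I had to change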

import java.nio.file.Files
import java.nio.file.Paths
...
    private fun getNoImage(): ResponseEntity<ByteArray> {
        val path = Paths.get("src/main/resources/images/noImage.png")
        val inputStream: InputStream = Files.newInputStream(path)
        return ResponseEntity.status(HttpStatus.NOT_FOUND).contentType(MediaType.IMAGE_PNG).body(IOUtils.toByteArray(inputStream))
    }

to

import org.springframework.core.io.ClassPathResource
...
    private fun getNoImage(): ResponseEntity<ByteArray> {
        val path = "images/noImage.png"
        val inputStream: InputStream = ClassPathResource(path).inputStream
        return ResponseEntity.status(HttpStatus.NOT_FOUND).contentType(MediaType.IMAGE_PNG).body(IOUtils.toByteArray(inputStream))
    }

Running mock DMC in a container on my local machine worked fine, but we ran into some issues running it on our Jenkins servers, so I temporarily added some extra logging flags to the gradle build command:

# from
RUN gradle clean build --no-daemon
# to
RUN gradle clean build --no-daemon --stacktrace --debug --scan

It turns out we needed to add a proxy so that docker on the Jenkins server could download its dependencies through our firewall:

RUN gradle clean build -Dhttps.proxyHost=proxy.host.for.download -Dhttps.proxyPort=XXXX --no-daemon > /dev/null 2>&1 || true

A few more bugs to work out

It turns out that carrying an image around in our application as a ByteArray only works up to a certain size. Since we needed a test that checks whether our app can correctly handle an image upload of up to 3MB, we had to use the java.sql.Blob data type instead. This required adding an adapter to convert between data types on image upload and retrieval:

import org.hibernate.engine.jdbc.BlobProxy
...
fun saveImageToDb(@PathVariable serviceId: String,
                  ...
                  imageFile: InputStream): ResponseEntity<PutResponseXml> {
    ...
    val imageAsByteArray = IOUtils.toByteArray(imageFile)
    val imageAsBlob = BlobProxy.generateProxy(imageAsByteArray)
    ...
    it.imageData = imageAsBlob
    ...
    dmcImageRepository.save(it)
...

fun getImageFromDb(@PathVariable serviceId: String, @PathVariable refKey: String): ResponseEntity<ByteArray> {
    val imageAsBlob = dmcImageRepository.findByServiceIdAndRefKey(serviceId, refKey)?.imageData ?: return getNoImage()
    val blobLength = imageAsBlob.length()
    // Blob positions are 1-based, so reading starts at position 1
    val imageAsByteArray = imageAsBlob.getBytes(1, blobLength.toInt())
    return ResponseEntity.ok().contentType(MediaType.IMAGE_JPEG).body(imageAsByteArray)
}
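If you want to try the byte array to Blob round trip on its own, here is a small sketch that runs without Spring or Hibernate. It swaps in the JDK's SerialBlob as a stand-in for BlobProxy, and the helper names toBlob and toByteArray are made up for this example, so treat it as an illustration of the conversion rather than the mock's actual code.

```kotlin
import java.sql.Blob
import javax.sql.rowset.serial.SerialBlob

// Wraps raw bytes in a java.sql.Blob. SerialBlob is the JDK's in-memory implementation,
// used here instead of Hibernate's BlobProxy so the sketch has no dependencies.
fun toBlob(bytes: ByteArray): Blob = SerialBlob(bytes)

// Reads the whole Blob back. Blob positions are 1-based, so reading starts at 1.
fun toByteArray(blob: Blob): ByteArray = blob.getBytes(1, blob.length().toInt())

fun main() {
    // 3MB of dummy data, the upload size our test needed to cover
    val threeMegabytes = ByteArray(3 * 1024 * 1024) { (it % 251).toByte() }
    val roundTripped = toByteArray(toBlob(threeMegabytes))
    println(roundTripped.contentEquals(threeMegabytes)) // true
}
```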

Finally, there was a very silly bug that I had written in from the very beginning and that we didn't notice until mid-April. Here's the offending line of code. Can you spot the problem?

val formatter = DateTimeFormatter.ofPattern("YYYY-MM-DD HH:mm:ss")

It turns out the "DD" in this pattern means day of year, and I should've been using "dd" for day of month. (The "YYYY" is a similar trap: it means week-based year, while "yyyy" is the calendar year.)

Everything was working perfectly fine because we didn't really care about the dates in our usage of the mock. But after April 10th, the 100th day of the year, the mock would just throw a 500 error when it tried to fit a 3-digit day into a 2-digit field. Something like this presumably would have been caught if I had been writing unit tests... and if this had been an app meant for production, there definitely would have been tests. Thankfully this was an easy fix.
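Here is a small Kotlin sketch of the difference between the two pattern letters, using one date before and one date on the 100th day of 2022. The formatWith helper is just a name made up for this example. What happens on day 100 with "DD" depends on the Java version (it throws on the Java 8 runtime our mock used), so the sketch captures that result rather than assuming it.

```kotlin
import java.time.LocalDateTime
import java.time.format.DateTimeFormatter

// Hypothetical helper: formats a timestamp with the given pattern.
fun formatWith(pattern: String, time: LocalDateTime): String =
    DateTimeFormatter.ofPattern(pattern).format(time)

fun main() {
    val early = LocalDateTime.of(2022, 3, 4, 9, 30, 0)  // day 63 of the year
    val late = LocalDateTime.of(2022, 4, 10, 9, 30, 0)  // day 100 of the year

    println(formatWith("DD", early)) // 63 (day of year)
    println(formatWith("dd", early)) // 04 (day of month)
    println(formatWith("dd", late))  // 10

    // Day 100 does not fit in two digits. Depending on the Java version, this either
    // throws a DateTimeException or prints three digits, so the outcome is wrapped.
    println(runCatching { formatWith("DD", late) })

    // The corrected pattern
    println(formatWith("yyyy-MM-dd HH:mm:ss", late)) // 2022-04-10 09:30:00
}
```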

Conclusion

Hopefully this gave you some idea of what you can build starting from a simple Hello World tutorial. The other thing I hoped to show with this story was how to approach a software engineering problem and how to build out your solution one tutorial and one debugging session at a time. I think a lot of the time in our work, we jump right into a large and mature codebase, and then get a bit lost and intimidated when we have to build something from scratch by ourselves. But if you break the problem down, aim for the smallest tangible solution first, and slowly build it up piece by piece, you'll be finished before you know it.

Obligatory Plug

Thanks for reading. If you’ve made it this far, maybe you’re interested in getting paid for building solid software instead of just for fun? Or maybe you’d like to come learn and grow with us? If so, we’re hiring so please consider applying.