Treasure Data:
At Treasure Data, we're on a mission to radically simplify how companies use data to create connected customer experiences. Our sophisticated cloud-based customer data platform drives operational efficiency across the enterprise to deliver powerful business outcomes in a way that's safe, flexible, and secure.
We are thrilled that the 2024 Gartner Magic Quadrant for Customer Data Platforms has recognized Treasure Data as a Leader! It's an honor to be acknowledged for our efforts in advancing the CDP industry with cutting-edge AI and real-time capabilities. View the report here.
Treasure Data employees are enthusiastic, data-driven, and customer-obsessed. We are a team of drivers: self-starters who take initiative, anticipate needs, and proactively jump in to solve problems. Our actions reflect our values of honesty, reliability, openness, and humility.
Your Role:
The Plazma team at Treasure Data is one of the essential elements of our CDP solution and is part of the Core Services group, which supports customer data ingestion and availability at a rate of 70B records per day. We develop & run the storage and query engine components and enable customers to find and store their data by offering comprehensive solutions based on OSS and proprietary software. You are expected to help the team develop the future of our Hadoop/Hive & Trino query engines and expand from there into our in-house developed storage solution. This includes maintaining technical excellence to address challenges that currently lack industry-wide solutions and delivering the roadmap together with your team. Our team consists of Big Data experts across Japan, Korea, and Canada who are passionate about OSS contribution, and we take pride in the quality of service we offer.
Responsibilities & Duties:
- Work as a member of the team by designing and developing Trino & Hadoop/Hive solutions
- Be responsible for providing solution expertise around Trino & Hadoop/Hive technologies. This includes technology assessment, use case development, as well as solution outline and design for modern data architectures
- Establish standards and guidelines for the design & development, tuning, deployment, and maintenance of advanced data access frameworks and distributed systems
- Document architectural and technology advancements
- Work with your team to set up the roadmap for Trino & Hadoop/Hive related products based on operational needs and customer requested features
- Mentor and train new members in the team
- Version and release management of Trino and Hive products
  - Evaluate, test, and set a base version
  - Backport needed patches from trunk, which contains the latest, cutting-edge version of the project and may therefore also be the least stable
  - Deploy new customer-facing features for Trino & Hadoop/Hive
  - Coordinate with support and product teams on product releases
- Make contributions to the open source community
  - Contribute bug fixes and new features to the Trino & Hadoop/Hive open source communities
- Work with the Site Reliability team to automate Trino & Hadoop/Hive cluster operations to reduce operational overhead
  - Design, develop, and evaluate metrics to ensure system health and plan infrastructure capacity of clusters
  - Design and develop scripts to automatically start and stop clusters and switch traffic between active clusters to balance customers' workloads
  - Design and develop failure recovery tools that automatically detect faults and recover faulty clusters
- Provide in-depth support services to Trino & Hadoop/Hive customers
  - Participate in the on-call rotation to support Trino & Hadoop/Hive customers
  - Handle escalations on product defects and performance issues; lead and perform in-depth troubleshooting of Trino & Hadoop/Hive related systems
  - Design and develop custom user-defined functions (UDFs)
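To give a flavor of the cluster-automation work described above, here is a minimal, hypothetical sketch in Python of fault detection and traffic switching between active clusters. All names (`Cluster`, `detect_faults`, `switch_traffic`) are illustrative only and are not part of Treasure Data's actual tooling.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """A simplified stand-in for a query-engine cluster and its health state."""
    name: str
    healthy: bool = True


def detect_faults(clusters):
    """Return the names of clusters that fail their health check."""
    return [c.name for c in clusters if not c.healthy]


def switch_traffic(active, clusters):
    """Keep traffic on the active cluster if healthy; otherwise fail over
    to the first healthy cluster."""
    current = next((c for c in clusters if c.name == active), None)
    if current is not None and current.healthy:
        return active
    for c in clusters:
        if c.healthy:
            return c.name
    raise RuntimeError("no healthy cluster available")


clusters = [Cluster("trino-a", healthy=False), Cluster("trino-b")]
print(detect_faults(clusters))               # clusters needing recovery
print(switch_traffic("trino-a", clusters))   # traffic fails over to trino-b
```

In production, the health check would of course query real metrics and the traffic switch would update a load balancer; the sketch only shows the control-flow shape of detect-then-recover automation.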
Required Qualifications:
For Intermediate Candidates:
- A BS or higher in Computer Science or equivalent experience
- Experience with distributed systems
- Experience in developing use cases, functional specs, design specs, ERDs, etc.
- Experience working with databases
- At least 2 years of experience in:
  - Distributed computing with Java
  - Operating production-scale deployments
  - MySQL, PostgreSQL, or other open-source distributed databases/key-value stores
- Strong analytical skills
- A solid understanding of computer science (algorithms, data structures, etc.)
- Solid experience in project and team management
- Able to work independently as well as in a team; sometimes the expert for your challenge is in a different time zone, and you can't always rely on getting help in a timely fashion
- Strong capability in implementing new and improved data solutions for multi-tenant environments
It would be nice if you had (Intermediate Candidates):
- One additional programming language, preferably Scala, Ruby, or Python
- Experience working with distributed, scalable Big Data stores or NoSQL systems, such as HDFS, S3, Cassandra, or Bigtable
- Experience with cloud architecture and services in public clouds like AWS, GCP, or Microsoft Azure
- Understanding of the capabilities of Hadoop/Hive or Trino
- Familiar with microservices-based software architecture
- Expertise in Data Integration patterns
- Experience with the development of multiple object-oriented systems
- Good understanding of infrastructure as code and operations
- Proficiency in Japanese (spoken and written)
For Senior Candidates:
- A BS or higher in Computer Science or equivalent experience
- Deep understanding of distributed systems and their challenges
- Solid understanding of cloud architecture and services in public clouds like AWS, GCP, or Microsoft Azure
- Experience in developing use cases, functional specs, design specs, ERDs, etc.
- Experience working with distributed, scalable Big Data stores or NoSQL systems, such as HDFS, S3, Cassandra, or Bigtable
- At least 5 years of experience in:
  - Distributed computing with Java, and at least one of Scala, Ruby, or Python
  - Working with and tuning the JVM
  - Distributed massively parallel processing (MPP) engines
  - Operating production-scale deployments
  - MySQL, PostgreSQL, or other open-source distributed databases/key-value stores
- Strong analytical skills
- A solid understanding of computer science (algorithms, data structures, etc.)
- Solid experience in project and team management and in handling Big Data problems
- Able to work independently as well as in a team
- Strong capability in implementing new and improved data solutions for multi-tenant environments
It would be nice if you had (Senior Candidates):
- Deep understanding of the capabilities of Trino or Hadoop/Hive
- Familiar with microservices-based software architecture
- Expertise in Data Integration patterns
- Strong track record of driving rapid prototyping and design for Big Data
- Experience with extending Free and Open-Source Software (FOSS) or COTS products
- Strong IT & Security skill sets and knowledge
- Experience with the design and development of multiple object-oriented systems
- Good understanding of infrastructure as code and operations
- Proficiency in Japanese (spoken and written)
Physical Requirements:
Must be located in the Greater Vancouver, BC, Canada area.
Travel Requirements:
Travel typically amounts to approximately 5% of the year, including one week in Japan annually, another in Mountain View, CA, and possibly an additional week elsewhere.
Perks and Benefits (Canada):
Our benefits package showcases our culture of care and empathy with:
- Competitive compensation packages
- Restricted Stock Units (RSUs)
- Paid vacation and sick time
- Paid volunteer and mental health days
- Up to 26 weeks paid parental leave
- 16 Company holidays (includes 2 floating holidays)
- RRSP with company match
- Employer-provided supplemental medical, dental, disability & life coverage
Our Dedication to You:
We value and promote diversity, equity, inclusion, and belonging in all aspects of our business and at all levels. Success comes from acknowledging, welcoming, and incorporating diverse perspectives.
Diverse representation alone is not the desired outcome. We also strive to create an inclusive culture that encourages growth, ownership of your role, and achieving innovation in new and unique ways. Your voice will be heard, and we will help amplify it.
Agencies and Recruiters:
We cannot consider your candidate(s) without a contract in place. Any resumes received without an active agreement will be considered gratis referrals to us. Thank you for your understanding and cooperation!