Posted: 2026/02/09

Rakuten Group, Inc.

Rakuten Group, Inc.: 1030129 Functional Lead, BSS Operations [DevOps & Platform Maintenance] - BSS Ops Department (BSOPD)

Salary: Undisclosed
Location: Tokyo

Rakuten Group

Internet services (e-commerce, media, apps)

PM / Project management

Company name

Rakuten Group, Inc.

Company overview

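Believe in the future and create a better tomorrow.
Empower people and society through innovation. Guided by this conviction, we deliver joy and delight to people around the world.
Rakuten offers more than 70 services spanning e-commerce, FinTech, digital content, and telecommunications, used by over 1 billion users worldwide.
These services are organically connected through a membership program centered on Rakuten members, forming the unique "Rakuten Ecosystem." Promoting diversity is one of Rakuten's top-priority corporate strategies. Employees come from more than 70 countries and regions. Talented people with unique and diverse cultural backgrounds and perspectives gather from around the world and drive innovation. The company cafeteria offers vegetarian and halal menus, and a prayer room is also available.
Rakuten also actively supports balancing work and childcare and promotes the employment and advancement of people with disabilities. It is strengthening information sharing and support for LGBT employees and allies within the company. Everyone can work as themselves and perform at their best. That is diversity at Rakuten.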

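Rakuten provides more than 70 services, has service locations in 30 countries, and employs people from more than 100 countries and regions. Another attraction is the open position system, which lets employees pursue varied career paths.
Flextime is available, as is remote work depending on circumstances. The head office houses a daycare center, a fitness gym, and a cafeteria offering three free meals a day, providing an environment that supports employees.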

Position

1030129 Functional Lead, BSS Operations [DevOps & Platform Maintenance] - BSS Ops Department (BSOPD)

Job details

Job Description:
Business Overview
The Technology Platforms Division (TPD) drives the growth of the Rakuten Ecosystem by delivering innovative, high-quality technology platforms characterized by integrated control and strategic partnerships.

Within TPD, the Telecom Business Application Supervisory Department (TBASD) develops and maintains a unified, high-quality Business Support System (BSS) for Rakuten Mobile. We deliver agile, scalable solutions across the customer lifecycle and continuously enhance system performance through close collaboration with stakeholders.

Department Overview
The Business Support System Operations Department (BSOPD) is responsible for operating a high-quality Business Support System (BSS) that integrates with the broader Rakuten Ecosystem, directly contributing to maximizing Rakuten Mobile’s business. These BSS platforms are critical for managing telecommunication business operations. Additionally, we provide excellent customer support and facilitate all BSS integrations.

Position:
Why We Hire
We are looking for entrepreneurial, innovative, growth-oriented, and customer-obsessed individuals to join our growing team to build the Telco of the Future.
We are a truly global organization, with team members from Japan, India, North America, South America, Europe, China, Korea, Australia, Africa, and more, shifting to a fast-paced, agile way of working.

Position Details
- Ensure high availability, resilience, and scalability across multi-region production environments through automation and proactive monitoring.
- Design and maintain CI/CD pipelines (Jenkins, GitLab CI, ArgoCD) to enable continuous delivery for microservice and portal components.
- Build and operate observability frameworks (metrics, logs, and traces) using Dynatrace, Grafana, Prometheus, Splunk, and Kibana.
- Develop and enhance infrastructure-as-code templates (Terraform, Ansible) to manage cloud and on-premise resources consistently.
- Participate in the on-call rotation for critical incidents, lead service restoration, and perform detailed Root Cause Analyses (RCA).
- Collaborate with development, product, and network teams to optimize system performance and stability across Rakuten’s digital ecosystem.
- Implement and track SLOs, SLIs, and SLAs for all critical services to improve reliability and align with business objectives.
- Contribute to post-incident reviews, drive automation for recurring issues, and continuously enhance system resilience.
- Create and maintain runbooks, dashboards, and knowledge base documentation for operational readiness and training.
- Support regular maintenance, feature rollouts, and security patching for production and pre-production environments.

Required experience and skills

Mandatory Qualifications:
1) Technical Expertise
- Cloud Platforms: Extensive hands-on experience with AWS and/or Rakuten Cloud Platform (RCP) services (e.g., EC2, EKS, S3, IAM, VPC, Route 53).
- Containerization & Orchestration: Strong expertise with Docker, Kubernetes (K8s), and Helm for deploying, scaling, and managing distributed, microservice-based applications. Experience with Helm charts, ConfigMaps, and Secrets management.
- Infrastructure as Code (IaC): Proficiency with Terraform, CloudFormation, or Ansible for automated infrastructure provisioning, configuration management, and drift detection.
- CI/CD Automation: Deep knowledge and hands-on experience designing and implementing automated build and deployment pipelines using Jenkins, GitLab CI/CD, and ArgoCD. Familiarity with Git branching strategies, artifact management (Nexus, Artifactory), and code quality gates (SonarQube). Experience with blue-green and canary deployment strategies.
- Monitoring & Observability: Expert-level experience with Dynatrace, Grafana, Prometheus, ELK Stack (Elasticsearch, Logstash, Kibana), Splunk, and/or New Relic for full-stack visibility, metrics collection, alerting, and dashboard creation.
- Logging & Tracing: Skilled in centralized logging and distributed tracing tools such as Dynatrace, New Relic, AppDynamics, Jaeger, or OpenTelemetry. Strong understanding of end-to-end observability for diagnosing complex issues.
- Scripting & Automation: Strong proficiency in Python, Shell (Bash), or Go for developing automation scripts, health checks, self-healing mechanisms, and reliability tools.
- Operating Systems: Expert in Linux/Unix administration, including performance tuning, troubleshooting, and security hardening.
- Networking & Security: Solid understanding of TCP/IP, DNS, load balancing, TLS/SSL, firewalls, and identity management (e.g., OAuth2, SSO).
- Incident Management: Proven experience in handling P1/P2 incidents, leading service restoration, performing detailed Root Cause Analyses (RCA), and implementing preventive measures.
- Version Control & Collaboration: Proficient in Git, Bitbucket, and agile collaboration tools like JIRA and Confluence.

2) Domain & Methodological Knowledge
- Telecom BSS/OSS Systems: Strong understanding of Rakuten’s customer-facing portals, CRM, order workflows, and the broader telecommunications BSS/OSS landscape.
- Site Reliability Engineering (SRE): Ability to define and monitor SLOs, SLIs, and SLAs to ensure service reliability and uptime targets. Familiarity with SRE best practices (e.g., Google SRE model) and error budget management.
- Hybrid/Multi-Cloud: Experience managing Kubernetes clusters and deploying applications in hybrid cloud or multi-cloud environments (AWS EKS, Rakuten Cloud Platform).
- Cost Optimization & Capacity Planning: Experience with cost optimization strategies and capacity planning in cloud environments.
- IT Governance: Familiarity with ITIL and ISO 27001 standards.

Working conditions

Employment type

Full-time (permanent employee)

Annual salary

Undisclosed

Work location

Tokyo
