General Questions

What is Oracle Cloud Infrastructure Streaming?

Oracle Cloud Infrastructure (OCI) Streaming provides a fully managed, scalable, and durable messaging solution for ingesting continuous, high-volume streams of data that you can consume and process in real time. Streaming is available in all supported Oracle Cloud Infrastructure regions. For a list, visit the Regions and Availability Domains page.

Why should I use Streaming?

Streaming is a serverless service that offloads the infrastructure management ranging from networking to storage and the configuration needed to stream your data. You do not have to worry about the provisioning of infrastructure, ongoing maintenance, or security patching. The Streaming service synchronously replicates data across three Availability Domains, providing high availability and data durability. In regions with a single Availability Domain, the data is replicated across three Fault Domains.

How can I use Streaming?

Streaming makes it easy to collect, store, and process data generated in real time from hundreds of sources. The number of use cases is nearly unlimited, ranging from messaging to complex data stream processing. The following are some of the many possible uses for Streaming:

  • Messaging: Use Streaming to decouple the components of large systems. Producers and consumers can use Streaming as an asynchronous message bus and act independently and at their own pace.
  • Metric and log ingestion: Use Streaming as an alternative for traditional file-scraping approaches to help make critical operational data quickly available for indexing, analysis, and visualization.
  • Web or mobile activity data ingestion: Use Streaming for capturing activity from websites or mobile apps, such as page views, searches, or other user actions. You can use this information for real-time monitoring and analytics, and in data warehousing systems for offline processing and reporting.
  • Infrastructure and apps event processing: Use Streaming as a unified entry point for cloud components to report their lifecycle events for audit, accounting, and related activities.

How do I get started with Streaming?

You can start using Streaming as follows:

  1. Create a stream by using the Oracle Cloud Infrastructure Console or the CreateStream API operation.
  2. Configure producers to publish messages to the stream. See Publishing Messages.
  3. Build consumers to read and process data from the stream. See Consuming Messages.

Alternatively, you can use Kafka APIs to produce to and consume from a stream. For more information, refer to Using Streaming with Apache Kafka.

What are the service limits of Streaming?

The throughput of Streaming is designed to scale up without limits by adding partitions to a stream. However, there are certain limits to keep in mind while using Streaming:

  • The maximum retention period for messages in a stream is seven days.
  • The maximum size of a single message that can be produced to a stream is 1 megabyte (MB).
  • Each partition can handle a total write rate of up to 1 MB per second, regardless of how many write requests are made.
  • Each partition can support a total read rate of up to 2 MB per second.
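
The retention and message-size limits above can be checked on the client before you create a stream or publish a message. The following is a minimal sketch, not part of any Oracle SDK: the function names are hypothetical, and treating 1 MB as 1024 * 1024 bytes is an assumption.

```python
# Limits taken from the list above. The byte interpretation of "1 MB"
# is an assumption made for this sketch.
MAX_RETENTION_HOURS = 7 * 24        # seven days
MAX_MESSAGE_BYTES = 1024 * 1024     # 1 MB per message (assumed binary megabyte)


def retention_is_valid(retention_hours: int) -> bool:
    """Return True if the requested retention fits within the seven-day maximum."""
    return 0 < retention_hours <= MAX_RETENTION_HOURS


def message_fits(payload: bytes) -> bool:
    """Return True if a single message payload is within the 1 MB limit."""
    return len(payload) <= MAX_MESSAGE_BYTES
```

A check like this only avoids an unnecessary round trip; the service remains the authority on what it accepts.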

How does Streaming compare to a queue-based service?

Streaming provides stream-based semantics. Stream semantics provide strict ordering guarantees per partition, message replayability, client-side cursors, and massive horizontal scale of throughput. Queues do not offer these features. FIFO queues can provide ordering guarantees, but only at the cost of significant performance overhead.

Key Concepts

What is a stream?

A stream is a partitioned, append-only log of messages to which producer applications write data and from which consumer applications read data.

What is a stream pool?

A stream pool is a grouping that you can use to organize and manage streams. Stream pools provide operational ease by providing an ability to share configuration settings across multiple streams. For example, users can share security settings like custom encryption keys on the stream pool to encrypt the data of all the streams inside the pool. A stream pool also enables you to create a private endpoint for streams by restricting internet access to all of the streams within a stream pool. For customers using Streaming's Kafka compatibility feature, the stream pool serves as the root of a virtual Kafka cluster, thereby enabling every action on that virtual cluster to be scoped to that stream pool.

What is a partition?

A partition is a base throughput unit that enables horizontal scale and parallelism of production and consumption from a stream. A partition provides a capacity of 1 MB/sec data input and 2 MB/sec data output. When you create a stream, you specify the number of partitions you need based on the throughput requirements of your application. For example, you can create a stream with 10 partitions, in which case you can achieve a throughput of 10 MB/sec input and 20 MB/sec output from a stream.
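
The sizing arithmetic above can be written down as a small helper. This is a sketch under the per-partition figures stated in this FAQ (1 MB/sec in, 2 MB/sec out); the function name is hypothetical.

```python
import math

# Per-partition capacity as stated in this FAQ.
WRITE_MB_PER_SEC_PER_PARTITION = 1.0
READ_MB_PER_SEC_PER_PARTITION = 2.0


def partitions_needed(write_mb_per_sec: float, read_mb_per_sec: float) -> int:
    """Return the smallest partition count that covers both write and read rates."""
    for_writes = math.ceil(write_mb_per_sec / WRITE_MB_PER_SEC_PER_PARTITION)
    for_reads = math.ceil(read_mb_per_sec / READ_MB_PER_SEC_PER_PARTITION)
    # A stream always has at least one partition.
    return max(1, for_writes, for_reads)
```

For example, partitions_needed(10, 20) returns 10, which matches the 10-partition stream described above, while partitions_needed(0.5, 5) returns 3 because the read rate is the constraint.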

What is a message?

A message is a base64-encoded unit of data stored in a stream. The maximum size of a message you can produce to a partition in a stream is 1 MB.
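
Because messages are base64-encoded, a producer typically encodes the key and value before sending them, and a consumer decodes them after reading. The sketch below uses only the Python standard library; the dictionary layout and function names are illustrative assumptions, not the service's request schema.

```python
import base64


def build_entry(key: str, value: bytes) -> dict:
    """Return a key/value pair encoded as base64 text."""
    return {
        "key": base64.b64encode(key.encode("utf-8")).decode("ascii"),
        "value": base64.b64encode(value).decode("ascii"),
    }


def decode_value(entry: dict) -> bytes:
    """Recover the original payload bytes from an encoded entry."""
    return base64.b64decode(entry["value"])
```

Note that base64 encoding increases the size of a payload by roughly one third, which is worth keeping in mind when messages approach the 1 MB limit.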

What is a key?

A key is an identifier used to group related messages. Messages with the same key are written to the same partition. Streaming ensures that any consumer of a given partition will always read that partition's messages in exactly the same order as they were written.
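
One common way to guarantee that messages with the same key land in the same partition is to hash the key and take the result modulo the partition count. This FAQ does not say how Streaming maps keys to partitions, so the following is only an illustration of the idea; the SHA-256 choice and the function name are assumptions.

```python
import hashlib


def partition_for_key(key: str, partition_count: int) -> int:
    """Map a key to a partition index deterministically (illustrative scheme only)."""
    if partition_count < 1:
        raise ValueError("partition_count must be at least 1")
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an unsigned integer.
    return int.from_bytes(digest[:8], "big") % partition_count
```

Because the mapping depends only on the key and the partition count, every message for a given key goes to one partition and is therefore read back in the order it was written.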

What is a producer?

A producer is a client application that can write messages to a stream.

What is a consumer and a consumer group?

A consumer is a client application that can read messages from one or more streams. A consumer group is a set of consumer instances that coordinate the consumption of messages from all of the partitions in a stream. At any given time, the messages from a specific partition can be consumed by only one consumer in the group.
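
The rule that each partition is read by exactly one instance in a group can be pictured as an assignment of partitions to instances. The sketch below uses a simple round-robin scheme as an assumption; this FAQ does not describe how the service actually balances partitions, and the function name is hypothetical.

```python
def assign_partitions(partition_ids: list, consumer_ids: list) -> dict:
    """Assign every partition to exactly one consumer instance, round-robin."""
    if not consumer_ids:
        raise ValueError("a consumer group needs at least one instance")
    assignment = {consumer: [] for consumer in consumer_ids}
    for index, partition in enumerate(sorted(partition_ids)):
        owner = consumer_ids[index % len(consumer_ids)]
        assignment[owner].append(partition)
    return assignment
```

With five partitions and two instances, one instance owns three partitions and the other owns two. Adding more instances than partitions leaves the extra instances idle, since a partition is never shared within a group.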

What is a cursor?

A cursor is a pointer to a location in a stream. This location could be a pointer to a specific offset or time in a partition, or to a group's current location.

What is an offset?

Each message within a partition has an identifier called an offset. Consumers can read messages starting from any offset they choose. Consumers can also commit the latest processed offset so that, if they stop and then restart, they can resume their work without replaying or missing a message.
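
The relationship between offsets, replay, and committed positions can be modeled with a small in-memory log. This is a teaching sketch only: it does not call the Streaming service, the class and method names are hypothetical, and it assumes offsets are consecutive integers starting at 0, which the service does not promise here.

```python
class PartitionLog:
    """In-memory stand-in for one partition: an append-only list of messages."""

    def __init__(self):
        self._messages = []
        self._committed = {}  # group name -> next offset to read

    def append(self, message: str) -> int:
        """Append a message and return the offset assigned to it."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset: int, limit: int = 10) -> list:
        """Return up to `limit` (offset, message) pairs starting at `offset`."""
        batch = self._messages[offset:offset + limit]
        return list(enumerate(batch, start=offset))

    def commit(self, group: str, offset: int) -> None:
        """Record that `group` has processed everything up to and including `offset`."""
        self._committed[group] = offset + 1

    def resume_offset(self, group: str) -> int:
        """Return where `group` should continue reading after a restart."""
        return self._committed.get(group, 0)
```

Reading from offset 0 at any time replays the log from the beginning, while reading from resume_offset(group) continues after the last committed message.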

Security

How secure is my data when I am using Oracle Cloud Infrastructure Streaming?

Streaming provides data encryption by default, both at rest and in transit. Streaming is fully integrated with Oracle Cloud Infrastructure Identity and Access Management (IAM), which lets you use access policies to selectively grant permissions to users and groups of users. While using REST APIs, you can also securely PUT and GET your data from Streaming through SSL endpoints with HTTPS protocol. Further, Streaming provides complete tenant-level isolation of data without any "noisy neighbor" problems.

Can I use my own set of master keys to encrypt the data in streams?

Streaming data is encrypted both at rest and in transit, along with ensuring message integrity. You can let Oracle manage encryption, or use Oracle Cloud Infrastructure Vault to securely store and manage your own encryption keys if you need to meet specific compliance or security standards.

What security settings of a stream pool can I edit after its creation?

You can edit the stream pool's data encryption settings at any time if you would like to switch between using "Encryption provided by Oracle Keys" and "Encryption managed using Customer Managed Keys". Streaming does not impose any restrictions on how many times this activity can be performed.

How do I manage and control access to my stream?

Streaming is fully integrated with Oracle Cloud Infrastructure IAM. Every stream has a compartment assigned. Users can specify role-based access control policies that may be used to describe fine-grained rules at a tenancy, compartment, or single-stream level.

An access policy is specified in the form "Allow <subject> to <verb> <resource-type> in <location> where <conditions>".
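
To make the shape of a policy statement concrete, the following sketch assembles one from its parts. The helper is hypothetical and performs no validation of subjects, verbs, or resource types; it only mirrors the template above and treats the "where" clause as optional.

```python
from typing import Optional


def build_policy(subject: str, verb: str, resource_type: str,
                 location: str, conditions: Optional[str] = None) -> str:
    """Assemble a policy statement following the template in this FAQ."""
    statement = f"Allow {subject} to {verb} {resource_type} in {location}"
    if conditions:
        statement += f" where {conditions}"
    return statement
```

For example, build_policy("group StreamAdmins", "manage", "streams", "tenancy") returns "Allow group StreamAdmins to manage streams in tenancy".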

What authentication mechanism do Kafka users have to use with Streaming?

Authentication with the Kafka protocol uses auth tokens and the SASL/PLAIN mechanism. You can generate tokens on the Console user details page. See Working with Auth Tokens for more information. We recommend that you create a dedicated group and user, and grant that group permission to manage streams in the appropriate compartment or tenancy. You can then generate an auth token for the user you created and use it in your Kafka client configuration.
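
A Kafka client configuration for SASL/PLAIN might be assembled along the following lines. This is a sketch with several assumptions: the property names follow the convention used by librdkafka-based clients (Java clients express the credentials through sasl.jaas.config instead), SASL_SSL is assumed because data is encrypted in transit, and the bootstrap address and user name format must be taken from Using Streaming with Apache Kafka rather than from this example.

```python
def kafka_sasl_plain_config(bootstrap_servers: str, username: str, auth_token: str) -> dict:
    """Build a client configuration dictionary for SASL/PLAIN over TLS.

    All three arguments are supplied by the caller; nothing here is a
    real endpoint or credential.
    """
    return {
        "bootstrap.servers": bootstrap_servers,
        "security.protocol": "SASL_SSL",   # assumption: TLS-protected SASL
        "sasl.mechanism": "PLAIN",
        "sasl.username": username,
        "sasl.password": auth_token,       # the auth token acts as the password
    }
```

Keep the auth token out of source control, for example by reading it from an environment variable or a secrets store before passing it to this function.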

Can I privately access Streaming APIs from my Virtual Cloud Network (VCN) without using public IPs?

Private endpoints restrict access to a specified virtual cloud network (VCN) within your tenancy so that its streams cannot be accessed through the internet. Private endpoints associate a private IP address within a VCN to the stream pool, allowing Streaming traffic to avoid traversing the internet. To create a private endpoint for Streaming, you need access to a VCN with a private subnet when you create the stream pool. See About Private Endpoints and VCNs and Subnets for more information.

Integrations

How do I use Oracle Cloud Infrastructure Streaming with Oracle Cloud Infrastructure Object Storage?

You can write the contents of a stream directly to an Object Storage bucket, typically to persist the data in the stream for long-term storage. This can be achieved using Kafka Connect for S3 with Streaming. For more information, see the Publishing To Object Storage From Oracle Streaming Service blog post.

How do I use Streaming with Oracle Autonomous AI Database?

You can ingest data from a table in an Oracle Autonomous AI Transaction Processing instance. For more information, see the Using Kafka Connect With Oracle Streaming Service And Autonomous DB blog post.

How do I use Streaming with Micronaut?

You can use the Kafka SDKs to produce and consume messages from Streaming, and you can use Micronaut's built-in support for Kafka. For more information, see the Easy Messaging With Micronaut's Kafka Support And Oracle Streaming Service blog post.

How do I use Streaming to ingest IoT data from MQTT brokers?

For information, see the Ingest IoT Data from MQTT Brokers into OCI-Oracle Streaming Service, OCI-Kafka Connect Harness, and Oracle Kubernetes Engine blog post.

Is Oracle GoldenGate for Big Data compatible with Streaming?

Oracle GoldenGate for Big Data is now certified to integrate with Streaming. For more information, see Connecting to Oracle Streaming Service in the Oracle GoldenGate for Big Data documentation.

Is there a way to ingest data directly from Streaming into Oracle Autonomous AI Lakehouse?

You need to use the Kafka Connect JDBC Sink connector to transport streaming data directly into Oracle Autonomous AI Lakehouse.

Pricing

How am I charged for using Oracle Cloud Infrastructure Streaming?

Streaming uses simple pay-as-you-use pricing, which ensures that you pay only for the resources you use. The pricing dimensions include:

  • GET/PUT request price: Gigabytes of data transferred
  • Price for storage (based on retention period hours used): Gigabytes of storage per hour

Please refer to the OCI Streaming page for the latest pricing information.
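
The two pricing dimensions above combine into a simple estimate. The sketch below takes both unit prices as parameters because this FAQ does not state them; the function name is hypothetical, and any rates you pass in should come from the OCI Streaming pricing page.

```python
def estimate_cost(gb_transferred: float, gb_stored: float, storage_hours: float,
                  price_per_gb_transferred: float, price_per_gb_hour: float) -> float:
    """Estimate a bill from the two pricing dimensions: data transferred and GB-hours stored."""
    transfer_cost = gb_transferred * price_per_gb_transferred
    storage_cost = gb_stored * storage_hours * price_per_gb_hour
    return transfer_cost + storage_cost
```

With placeholder rates of 0.5 per gigabyte transferred and 0.25 per gigabyte-hour, transferring 100 GB and storing 50 GB for 24 hours gives 100 * 0.5 + 50 * 24 * 0.25 = 350. These rates are made up for the arithmetic and are not Oracle prices.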

Will I be charged for provisioning even if I don't use the service?

Streaming’s industry-leading pricing model ensures that you pay only when you use the service within the default service limits.

Is there an additional charge for moving data in and out of Streaming?

Streaming does not charge an additional price for moving data in and out of the service. Further, users can leverage the power of Service Connector Hub to move data to and from Streaming in a serverless manner at no additional price.

Is there a free tier for Streaming?

Streaming currently doesn't operate in the free tier.

Managing Oracle Cloud Infrastructure Streams

What IAM permissions do I need to access Streaming?

Identity and Access Management lets you control who has access to your cloud resources. To use Oracle Cloud Infrastructure resources, you must be given the required type of access in a policy written by an administrator, whether you're using the Console or the REST API with an SDK, CLI, or other tools. An access policy is specified in the following form:


Allow <subject> to <verb> <resource-type> in <location> where <conditions>

Administrators of a tenancy can use the policy


Allow group StreamAdmins to manage streams in tenancy

which lets the specified group, StreamAdmins, do everything with Streaming, including creating, updating, listing, and deleting streams and their related resources. However, you can always write more granular policies so that only selected users in a group can perform only a subset of activities on a given stream. If you're new to policies, see Getting Started with Policies and Common Policies. If you want to dig deeper into writing policies for Streaming, see Details for the Streaming Service in the IAM policy reference.

How can I automate the deployment of streams at scale?

You can provision a stream and all its associated components, such as IAM policies, partitions, and encryption settings, using the Oracle Cloud Infrastructure Resource Manager or the Terraform provider for Oracle Cloud Infrastructure. For information on the Terraform provider, see