Lasifu Ta

Confluent

  • data processing
  • ai agent
  • developer tool
Introduction

Help customers shift left by moving data processing and cleaning closer to the source, which cuts wasted data, manual break-fix work, and cost.

payments
  • Id: uuid
  • tripID: varchar
  • method: enum
  • amount: float
  • createdAt: timestamp
sales pre-aggregation
DEV
db_dev
CREATE TABLE `DEV`.`db_dev`.`daily_sales_summary` (
  product_id BIGINT,
  `date` DATE,
  total_quantity BIGINT,
  total_revenue DECIMAL
);
INSERT INTO `DEV`.`db_dev`.`daily_sales_summary`
SELECT
  product_id,
  -- Cast to DATE so the value matches the DATE column of the target table
  CAST(order_date AS DATE) AS `date`,
  SUM(quantity) AS total_quantity,
  SUM(quantity * price) AS total_revenue
FROM
  sales
GROUP BY
  product_id, CAST(order_date AS DATE);
Results: >5k
Start: 2022-12-10 14:42:11
Duration: 820ms
Job Id: qewg-2t1t-xf43
SELECT * FROM daily_sales_summary LIMIT 10;
Results: 816
Start: 2022-12-10 14:51:32
Duration: 240ms
Job Id: xf43-qewg-2t1t
product_id (BIGINT)  date (DATE)  total_quantity (BIGINT)  total_revenue (DECIMAL)
101                  2023-01-15   2                        60
102                  2023-01-15   1                        20
101                  2023-01-16   3                        90
103                  2023-01-16   4                        40
104                  2023-01-17   5                        125
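
The pre-aggregation above can be tried locally before it runs against a real topic. The sketch below is only a stand-in: it uses Python's built-in sqlite3 module rather than Flink SQL, so `DATE(...)` replaces the Flink date expression, and the three rows in `SALES_ROWS` are made-up values for illustration, not data from the pipeline.

```python
import sqlite3

# Hypothetical input rows: (product_id, order_date, quantity, price).
SALES_ROWS = [
    (101, "2023-01-15 09:30:00", 1, 30.0),
    (101, "2023-01-15 17:05:00", 1, 30.0),
    (102, "2023-01-15 11:00:00", 1, 20.0),
]


def daily_sales_summary(rows):
    """Group sales by product and calendar day, summing quantity and revenue."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales ("
            "product_id INTEGER, order_date TEXT, quantity INTEGER, price REAL)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?)", rows)
        cursor = conn.execute(
            """
            SELECT product_id,
                   DATE(order_date)      AS day,
                   SUM(quantity)         AS total_quantity,
                   SUM(quantity * price) AS total_revenue
            FROM sales
            GROUP BY product_id, DATE(order_date)
            ORDER BY product_id, day
            """
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for row in daily_sales_summary(SALES_ROWS):
        print(row)
```

Each output row has the same four columns as the summary table: product, day, total quantity, and total revenue.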
Stream lineage diagram showing data flow from source topics through transformations to sink
Build data pipeline

A SQL workspace for processing data and exploring schemas, all in one place.

-- Retrieve basic trip details including customer and driver information

SELECT
  customers.name AS customer_name,
  drivers.name AS driver_name,
  trips.start_location,
  trips.end_location
FROM
  trips
JOIN
  customers ON trips.customer_id = customers.customer_id
JOIN
  drivers ON trips.driver_id = drivers.driver_id;
Results: >5k
Start: 2022-12-10 14:42:11
Duration: 450ms
Job Id: qewg-2t1t-xf43
customer_name (varchar)  driver_name (varchar)  start_location (point)  end_location (point)
John Doe                 Jane Smith             123 Elm St              456 Oak St
Alice Johnson            Bob Brown              789 Pine Ave            101 Maple Dr
Michael Green            Sara White             202 Birch Blvd          303 Cedar Lane
Emily Davis              Tom Black              404 Walnut St           505 Chestnut St
David Wilson             Anna Lee               606 Willow Way          707 Spruce St
Sarah Connor             John Connor            808 Poplar Pl           909 Redwood Rd
Frank Castle             Karen Page             123 Oakridge Ct         456 Highland Ave
Peter Parker             Tony Stark             789 Summer St           101 Autumn Dr
Bruce Wayne              Alfred Penny           202 Winter Blvd         303 Spring Ln
Clark Kent               Lois Lane              404 Hero Rd             505 Villain Ave
Diana Prince             Steve Trevor           606 Gotham St           707 Metropolis Blvd
Barry Allen              Oliver Queen           808 Lantern Dr          909 Arrow Ave
Hal Jordan               Carol Ferris           123 Flash Blvd          456 Arrow Ct
Arthur Curry             Mera Anderson          789 Batman St           101 Robin Rd
Victor Stone             Cyborg Thomas          202 Wonder Woman Ln     303 Aquaman Blvd
Understand data pipeline

Stream lineage and Query profiler make pipelines easier to discover, understand, and debug.

Stream lineage diagram showing data flow from source topics through transformations to sink

Stream lineage maps the pipeline, and the query profiler exposes task-level execution details, giving visibility into how a statement runs. This accelerates detection of bottlenecks, skew, and other performance problems.

Query profiler diagram showing pipeline stages from source through join, filter, calc to sink
Simplify development

Make SQL easier to write and quicker to get started with.

-- Retrieve trip details including customer and driver info
SELECT
  customers.name AS customer_name,
  drivers.name AS driver_name,
  trips.start_location,
  trips.end_location
FROM
  trips
JOIN
  customers ON trips.customer_id = customers.customer_id
JOIN
  drivers ON trips.driver_id = drivers.driver_id;

CREATE TABLE `DEV`.`db_dev`.`table_name` (
  column1 TYPE,
  column2 TYPE,
  ...
)
AS SELECT id, name, email, createAt
FROM `DEV`.`db_dev`.`customers`
WHERE ...;

Artifact

cal_geolocation.jar
ID: fo91-lfcp-359zm0
Environment: DEV
Cloud & Region: AWS | us-west-2 (Oregon)

SQL

USE CATALOG DEV;
USE cluster_rides;

CREATE FUNCTION cal_geolocation AS 
  'com.example.cal_geolocation'
USING JAR
  'confluent-artifact://fo91-lfcp-359zm0';
Agentic AI

Build real-time AI agents.

Agent configurations

  • Allow the agent to access and interact with external systems and data.
  • Determine the agent's intelligence and speed, and how it handles complex tasks.
  • Define how the agent should behave when an error occurs.
  • Number of tool or step failures allowed before stopping.
  • Maximum number of reasoning/acting loop cycles.
  • Token limit that triggers trimming or summarization.
  • Maximum allowed runtime for each request.
  • Style of summary used when compressing context.
  • How the runtime reduces tokens when the context gets too large.
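
The numeric limits described above could interact roughly as in the sketch below. This is an illustration under stated assumptions, not the product's implementation: the class name, field names, default values, and the order in which limits are checked are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentLimits:
    """Hypothetical names and defaults for the runtime limits of an agent."""

    max_failures: int = 3              # tool/step failures allowed before stopping
    max_iterations: int = 10           # reasoning/acting loop cycles
    context_token_limit: int = 8000    # token count that triggers trimming or summarization
    request_timeout_s: float = 60.0    # maximum runtime for each request


def next_action(limits, failures, iterations, context_tokens, elapsed_s):
    """Decide what the runtime does next, given the current counters."""
    if elapsed_s >= limits.request_timeout_s:
        return "stop: timeout"
    if failures > limits.max_failures:
        return "stop: too many failures"
    if iterations >= limits.max_iterations:
        return "stop: iteration limit"
    if context_tokens > limits.context_token_limit:
        # The context is compressed (trimmed or summarized) and the loop goes on.
        return "compress context"
    return "continue"


if __name__ == "__main__":
    limits = AgentLimits()
    print(next_action(limits, failures=1, iterations=4, context_tokens=9500, elapsed_s=12.0))
```

Hard stops are checked before context compression in this sketch, so a request that has already run out of time is never summarized first.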

Flink actions

Make data processing accessible to a broader audience, not just data experts.

Create embeddings

Transform column data into vector embeddings and append to the topic.

De-duplicate topic

Generate a topic containing only unique records from an input topic.

Mask topic

Generate a topic containing masked fields from an input topic.

Join topic

Combine records from multiple input topics to produce a unified output topic.

Filter topic

Select and retain only records that meet specific criteria from an input topic.

Transform topic

Apply custom transformations to the data in a topic.
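
To make the behavior of a few of these actions concrete, the sketch below applies de-duplicate, mask, and filter to a small in-memory list. It is a simplification: real actions run continuously over unbounded topics, whereas this works on a finite list, and the function names, the `"****"` placeholder, and the sample ride records are hypothetical.

```python
def deduplicate(records, key):
    """Keep only the first record seen for each value of `key`."""
    seen = set()
    unique = []
    for record in records:
        if record[key] not in seen:
            seen.add(record[key])
            unique.append(record)
    return unique


def mask(records, fields, placeholder="****"):
    """Return copies of the records with the named fields replaced by a placeholder."""
    return [
        {name: (placeholder if name in fields else value) for name, value in record.items()}
        for record in records
    ]


def keep_matching(records, predicate):
    """Retain only the records for which `predicate` returns True."""
    return [record for record in records if predicate(record)]


# Hypothetical ride events; the second one is an exact duplicate of the first.
RIDES = [
    {"ride_id": "r1", "passenger": "alice@example.com", "ride_status": "COMPLETED"},
    {"ride_id": "r1", "passenger": "alice@example.com", "ride_status": "COMPLETED"},
    {"ride_id": "r2", "passenger": "bob@example.com", "ride_status": "CANCELLED"},
]

if __name__ == "__main__":
    unique = deduplicate(RIDES, key="ride_id")
    masked = mask(unique, fields={"passenger"})
    completed = keep_matching(masked, lambda r: r["ride_status"] == "COMPLETED")
    print(completed)
```

Each function returns a new list and leaves its input untouched, mirroring how an action writes to a new output topic rather than modifying the input topic.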

Action details

  • customerId: uuid
  • driverId: uuid
  • paymentId: uuid
  • status: enum
  • source: float
  • destination: float

Output topic

Flink statement

573e-01c0-a49b

Status: Running
Compute pool: default-pool
Duration: 33s
Scaling status: Fine
Statement type: INSERT INTO
Start time: 2024-06-21 12:42:33

SQL

INSERT INTO completed_rides
SELECT 
    driver_id, passenger_id, start_location,
    end_location, start_time, end_time, fare
FROM 
    rides_source
WHERE 
    ride_status = 'COMPLETED';

Lineage

TOPIC Source → refresh (streaming) → TOPIC Sink

Message behind

Message in

Message out

Infrastructure

Establish a robust infrastructure that meets the team's unique needs.

AWS
GCP
Azure

GCP.uswest.env-03638o

Running
ID: lfcp-359zm0
Current CFU: 32
Max CFU: 50
Cloud & Region: GCP | Las Vegas (us-west4)

Account

Execute long-running Flink statements using a service account to enhance security and ensure controlled resource access.

pl-egress-01

Ready | gw-0df98fr | AWS | us-east-2 | Egress

Network details

DNS domain: us-east2.aws.private.cflt.cloud
Connection type: PrivateLink gateway
PrivateLink ID: com.amazon.aws.vcpe.us-east2.vcpe-sve-09dw08dgdg09sdf8

Access points