Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Why does actual-data-nodes work without specifying physical tables explicitly? #32591

Open
good-134 opened this issue Aug 18, 2024 · 3 comments

Comments

@good-134
Copy link

I have observed an unexpected behavior in ShardingSphere's table sharding configuration. My configuration doesn't explicitly specify all physical tables, yet it works correctly. I'm trying to understand why this is possible.

Here's my current configuration and sharding code:

  shardingsphere:
    datasource:
      names: ds0
      ds0:
        type: com.zaxxer.hikari.HikariDataSource
        driver-class-name: com.mysql.cj.jdbc.Driver
        url: jdbc:mysql://127.0.0.1:3306/demo?useUnicode=true&characterEncoding=UTF-8&useSSL=false&serverTimezone=GMT%2B8&allowPublicKeyRetrieval=true
        username: root
        password: '111111'
    rules:
      sharding:
        key-generate:
          snowflake-gen:
            type: SNOWFLAKE
        shardingAlgorithms:
          agreement-table-complex-algorithm:
            type: CLASS_BASED
            props:
              strategy: COMPLEX
              algorithmClassName: biz.demo.config.sharding.AgreementAlgorithm
        tables:
          agreement:
            actual-data-nodes: ds0.agreement
            table-strategy:
              complex:
                sharding-algorithm-name: agreement-table-complex-algorithm
                sharding-columns: id,create_time
    props:
      sql-show: true
public Collection<String> doSharding(Collection<String> availableTargetNames, ComplexKeysShardingValue<Long> complexKeysShardingValue) {

        Map<String, Range<Long>> columnNameAndRangeValuesMap = complexKeysShardingValue.getColumnNameAndRangeValuesMap();

        Map<String, Collection<Long>> columnNameAndShardingValuesMap = complexKeysShardingValue.getColumnNameAndShardingValuesMap();

        List<String> actualTarget = new ArrayList<>();
        Snowflake snowflakeInstances = Snow.getSnowflakeInstances();
        Collection<Long> idKeyColl = columnNameAndShardingValuesMap.get("id");
        String logicTableName = complexKeysShardingValue.getLogicTableName();
        if (CollectionUtils.isNotEmpty(idKeyColl)) {
            for (Long id : idKeyColl) {
                long generateDateTime = snowflakeInstances.getGenerateDateTime(id);
                int suffix = Math.abs(String.valueOf(generateDateTime).hashCode() % 4);
                String tb = logicTableName + "_" + suffix;
                actualTarget.add(tb);
            }
        }
        return actualTarget;
}

In my database, I have four physical tables: agreement_0, agreement_1, agreement_2, and agreement_3. However, in my configuration, I've only specified actual-data-nodes: ds0.agreement instead of actual-data-nodes: ds0.agreement_$->{0..3}.

Despite this, my sharding setup works correctly. The system successfully routes queries to the appropriate physical tables.

My questions are:

  1. Why does this configuration work without explicitly specifying all physical tables?
  2. Is this behavior intentional or a side effect of how ShardingSphere processes configurations?
  3. Are there any potential risks or downsides to using this simplified configuration in a production environment?
  4. Is there any documentation that explains this behavior?

Is there some default behavior in ShardingSphere that I'm not aware of, or is it related to my custom sharding algorithm?
Thank you.

@apache apache locked and limited conversation to collaborators Aug 19, 2024
@apache apache unlocked this conversation Aug 19, 2024
@linghengqian
Copy link
Member

The locking issue was just a mistake of hand. Can you debug the variables actualTarget and availableTargetNames? That sounds kind of weird.

@good-134
Copy link
Author

image

image

I have debugged the variables availableTargetNames and actualTarget as you suggested.
availableTargetNames contains the logical table name agreement.
actualTarget contains a single physical table name agreement_2.
From this, it seems that my custom sharding algorithm is correctly mapping the logical table to the appropriate physical table. However, my original question was about why my configuration works even though I didn't specify the physical table names explicitly in the actual-data-nodes property.
Could it be that ShardingSphere relies on the sharding algorithm to resolve the physical table names from the logical table name, even when the actual-data-nodes doesn't explicitly list all the tables? Is this behavior expected?

Thank you for your help.

@linghengqian
Copy link
Member

Is this behavior expected?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants