How to avoid using variables in a WHERE clause


16

Given a (simplified) stored procedure such as this:

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

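If the Sale table is large, the SELECT can take a long time to execute, apparently because the optimizer cannot optimize it due to the local variable. We tested the SELECT part first with variables and then with hard-coded dates, and the execution time went from ~9 minutes to ~1 second.

We have numerous stored procedures that query based on "fixed" date ranges (week, month, 8 weeks, etc.), so the input parameter is just @endDate and @startDate is calculated inside the procedure.

The question is: what is the best way to avoid variables in a WHERE clause so as not to compromise the optimizer?

The possibilities we came up with are shown below. Are any of these best practice, or is there another way?

Use a wrapper procedure to turn the variables into parameters.

Parameters don't affect the optimizer the same way local variables do.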

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
   DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
   EXECUTE DateRangeProc @startDate, @endDate
END

CREATE PROCEDURE DateRangeProc(@startDate DATE, @endDate DATE)
AS
BEGIN
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

Use parameterized dynamic SQL.

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  DECLARE @sql NVARCHAR(4000) = N'
    SELECT
      -- Stuff
    FROM Sale
    WHERE SaleDate BETWEEN @startDate AND @endDate
  '
  DECLARE @param NVARCHAR(4000) = N'@startDate DATE, @endDate DATE'
  EXECUTE sp_executesql @sql, @param, @startDate = @startDate, @endDate = @endDate
END

Use "hard-coded" dynamic SQL.

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  DECLARE @sql NVARCHAR(4000) = N'
    SELECT
      -- Stuff
    FROM Sale
    WHERE SaleDate BETWEEN ''@startDate'' AND ''@endDate''
  '
  SET @sql = REPLACE(@sql, '@startDate', CONVERT(NCHAR(10), @startDate, 126))
  SET @sql = REPLACE(@sql, '@endDate', CONVERT(NCHAR(10), @endDate, 126))
  EXECUTE sp_executesql @sql
END

Use the DATEADD() function directly.

I'm not keen on this because calling functions in the WHERE clause also affects performance.

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN DATEADD(DAY, -6, @endDate) AND @endDate
END

Use an optional parameter.

I'm not sure whether assigning to a parameter has the same problem as assigning to a variable, so this may not be an option. I don't really like this solution, but I include it for completeness.

CREATE PROCEDURE WeeklyProc(@endDate DATE, @startDate DATE = NULL)
AS
BEGIN
  SET @startDate = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

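-- Update --

Thanks for the suggestions and comments. After reading them I ran some timing tests with the various approaches. I'm adding the results here as a reference.

Run 1 is without a plan. Run 2 is immediately after Run 1 with exactly the same parameters, so it uses the plan from Run 1.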

The NoProc times are for running the SELECT queries manually in SSMS outside a stored procedure.

TestProc1-7 are the queries from the original question.

TestProcA-B are based on the suggestion by Mikael Eriksson. The column in the database is a DATE so I tried passing the parameter as a DATETIME and running with implicit casting (testProcA) and explicit casting (testProcB).

TestProcC-D are based on the suggestion by Kenneth Fisher. We already use a date lookup table for other things, but we don't have one with a specific column for each period range. The variation I tried still uses BETWEEN but does it on the smaller lookup table and joins to the larger table. I'm going to investigate further as to whether we can use specific lookup tables, although our periods are fixed there are quite a few different ones.

    Total rows in Sale table: 136,424,366

                       Run 1 (ms)     Run 2 (ms)
    Procedure          CPU   Elapsed  CPU    Elapsed  Comment
    NoProc constants   6567  62199    2870   719      Manual query with constants
    NoProc variables   9314  62424    3993   998      Manual query with variables
    testProc1          6801  62919    2871   736      Hard coded range
    testProc2          8955  63190    3915   979      Parameter and variable range
    testProc3          8985  63152    3932   987      Wrapper procedure with parameter range
    testProc4          9142  63939    3931   977      Parameterized dynamic SQL
    testProc5          7269  62933    2933   728      Hard coded dynamic SQL
    testProc6          9266  63421    3915   984      Use DATEADD on DATE
    testProc7          2044  13950    1092  1087      Dummy parameter
    testProcA         12120  61493    5491  1875      Use DATEADD on DATETIME without CAST
    testProcB          8612  61949    3932   978      Use DATEADD on DATETIME with CAST
    testProcC          8861  61651    3917   993      Use lookup table, Sale first
    testProcD          8625  61740    3994  1031      Use lookup table, Sale last

Here's the test code.

------ SETUP ------

IF OBJECT_ID(N'testDimDate', N'U') IS NOT NULL DROP TABLE testDimDate
IF OBJECT_ID(N'testProc1', N'P') IS NOT NULL DROP PROCEDURE testProc1
IF OBJECT_ID(N'testProc2', N'P') IS NOT NULL DROP PROCEDURE testProc2
IF OBJECT_ID(N'testProc3', N'P') IS NOT NULL DROP PROCEDURE testProc3
IF OBJECT_ID(N'testProc3a', N'P') IS NOT NULL DROP PROCEDURE testProc3a
IF OBJECT_ID(N'testProc4', N'P') IS NOT NULL DROP PROCEDURE testProc4
IF OBJECT_ID(N'testProc5', N'P') IS NOT NULL DROP PROCEDURE testProc5
IF OBJECT_ID(N'testProc6', N'P') IS NOT NULL DROP PROCEDURE testProc6
IF OBJECT_ID(N'testProc7', N'P') IS NOT NULL DROP PROCEDURE testProc7
IF OBJECT_ID(N'testProcA', N'P') IS NOT NULL DROP PROCEDURE testProcA
IF OBJECT_ID(N'testProcB', N'P') IS NOT NULL DROP PROCEDURE testProcB
IF OBJECT_ID(N'testProcC', N'P') IS NOT NULL DROP PROCEDURE testProcC
IF OBJECT_ID(N'testProcD', N'P') IS NOT NULL DROP PROCEDURE testProcD
GO

CREATE TABLE testDimDate
(
   DateKey DATE NOT NULL,
   CONSTRAINT PK_DimDate_DateKey UNIQUE NONCLUSTERED (DateKey ASC)
)
GO

DECLARE @dateTimeStart DATETIME = '2000-01-01'
DECLARE @dateTimeEnd DATETIME = '2100-01-01'
;WITH CTE AS
(
   --Anchor member defined
   SELECT @dateTimeStart FullDate
   UNION ALL
   --Recursive member defined referencing CTE
   SELECT FullDate + 1 FROM CTE WHERE FullDate + 1 <= @dateTimeEnd
)
SELECT
   CAST(FullDate AS DATE) AS DateKey
INTO #DimDate
FROM CTE
OPTION (MAXRECURSION 0)

INSERT INTO testDimDate (DateKey)
SELECT DateKey FROM #DimDate ORDER BY DateKey ASC

DROP TABLE #DimDate
GO

-- Hard coded date range.
CREATE PROCEDURE testProc1 AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN '2012-12-09' AND '2012-12-10'
END
GO

-- Parameter and variable date range.
CREATE PROCEDURE testProc2(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
END
GO

-- Parameter date range.
CREATE PROCEDURE testProc3a(@startDate DATE, @endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
END
GO

-- Wrapper procedure.
CREATE PROCEDURE testProc3(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   EXEC testProc3a @startDate, @endDate
END
GO

-- Parameterized dynamic SQL.
CREATE PROCEDURE testProc4(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   DECLARE @sql NVARCHAR(4000) = N'SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate'
   DECLARE @param NVARCHAR(4000) = N'@startDate DATE, @endDate DATE'
   EXEC sp_executesql @sql, @param, @startDate = @startDate, @endDate = @endDate
END
GO

-- Hard coded dynamic SQL.
CREATE PROCEDURE testProc5(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   DECLARE @sql NVARCHAR(4000) = N'SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN ''@startDate'' AND ''@endDate'''
   SET @sql = REPLACE(@sql, '@startDate', CONVERT(NCHAR(10), @startDate, 126))
   SET @sql = REPLACE(@sql, '@endDate', CONVERT(NCHAR(10), @endDate, 126))
   EXEC sp_executesql @sql
END
GO

-- Explicitly use DATEADD on a DATE.
CREATE PROCEDURE testProc6(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN DATEADD(DAY, -1, @endDate) AND @endDate
END
GO

-- Dummy parameter.
CREATE PROCEDURE testProc7(@endDate DATE, @startDate DATE = NULL) AS
BEGIN
   SET NOCOUNT ON
   SET @startDate = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
END
GO

-- Explicitly use DATEADD on a DATETIME with implicit CAST for comparison with SaleDate.
-- Based on the answer from Mikael Eriksson.
CREATE PROCEDURE testProcA(@endDateTime DATETIME) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN DATEADD(DAY, -1, @endDateTime) AND @endDateTime
END
GO

-- Explicitly use DATEADD on a DATETIME but CAST to DATE for comparison with SaleDate.
-- Based on the answer from Mikael Eriksson.
CREATE PROCEDURE testProcB(@endDateTime DATETIME) AS
BEGIN
   SET NOCOUNT ON
   SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN CAST(DATEADD(DAY, -1, @endDateTime) AS DATE) AND CAST(@endDateTime AS DATE)
END
GO

-- Use a date lookup table, Sale first.
-- Based on the answer from Kenneth Fisher.
CREATE PROCEDURE testProcC(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM Sale J INNER JOIN testDimDate D ON D.DateKey = J.SaleDate WHERE D.DateKey BETWEEN @startDate AND @endDate
END
GO

-- Use a date lookup table, Sale last.
-- Based on the answer from Kenneth Fisher.
CREATE PROCEDURE testProcD(@endDate DATE) AS
BEGIN
   SET NOCOUNT ON
   DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)
   SELECT SUM(Value) FROM testDimDate D INNER JOIN Sale J ON J.SaleDate = D.DateKey WHERE D.DateKey BETWEEN @startDate AND @endDate
END
GO

------ TEST ------

SET STATISTICS TIME OFF

DECLARE @endDate DATE = '2012-12-10'
DECLARE @startDate DATE = DATEADD(DAY, -1, @endDate)

DBCC FREEPROCCACHE WITH NO_INFOMSGS
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

RAISERROR('Run 1: NoProc with constants', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN '2012-12-09' AND '2012-12-10'
SET STATISTICS TIME OFF

RAISERROR('Run 2: NoProc with constants', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN '2012-12-09' AND '2012-12-10'
SET STATISTICS TIME OFF

DBCC FREEPROCCACHE WITH NO_INFOMSGS
DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

RAISERROR('Run 1: NoProc with variables', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
SET STATISTICS TIME OFF

RAISERROR('Run 2: NoProc with variables', 0, 0) WITH NOWAIT
SET STATISTICS TIME ON
SELECT SUM(Value) FROM Sale WHERE SaleDate BETWEEN @startDate AND @endDate
SET STATISTICS TIME OFF

DECLARE @sql NVARCHAR(4000)

DECLARE _cursor CURSOR LOCAL FAST_FORWARD FOR
   SELECT
      procedures.name,
      procedures.object_id
   FROM sys.procedures
   WHERE procedures.name LIKE 'testProc_'
   ORDER BY procedures.name ASC

OPEN _cursor

DECLARE @name SYSNAME
DECLARE @object_id INT

FETCH NEXT FROM _cursor INTO @name, @object_id
WHILE @@FETCH_STATUS = 0
BEGIN
   SET @sql = CASE (SELECT COUNT(*) FROM sys.parameters WHERE object_id = @object_id)
      WHEN 0 THEN @name
      WHEN 1 THEN @name + ' ''@endDate'''
      WHEN 2 THEN @name + ' ''@startDate'', ''@endDate'''
   END

   SET @sql = REPLACE(@sql, '@name', @name)
   SET @sql = REPLACE(@sql, '@startDate', CONVERT(NVARCHAR(10), @startDate, 126))
   SET @sql = REPLACE(@sql, '@endDate', CONVERT(NVARCHAR(10), @endDate, 126))

   DBCC FREEPROCCACHE WITH NO_INFOMSGS
   DBCC DROPCLEANBUFFERS WITH NO_INFOMSGS

   RAISERROR('Run 1: %s', 0, 0, @sql) WITH NOWAIT
   SET STATISTICS TIME ON
   EXEC sp_executesql @sql
   SET STATISTICS TIME OFF

   RAISERROR('Run 2: %s', 0, 0, @sql) WITH NOWAIT
   SET STATISTICS TIME ON
   EXEC sp_executesql @sql
   SET STATISTICS TIME OFF

   FETCH NEXT FROM _cursor INTO @name, @object_id
END

CLOSE _cursor
DEALLOCATE _cursor

Answers:


9

Parameter sniffing is your friend almost all of the time, and you should write your queries so that it can be used. Parameter sniffing helps build the plan using the parameter values available when the query is compiled. The dark side of parameter sniffing is when the values used to compile the query are not optimal for the queries to come.

The query in a stored procedure is compiled when the stored procedure is executed, not when the query is executed, so the values that SQL Server has to deal with here...

CREATE PROCEDURE WeeklyProc(@endDate DATE)
AS
BEGIN
  DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

are a known value for @endDate and an unknown value for @startDate. That leaves SQL Server guessing at 30% of the rows returned for the filter on @startDate, combined with whatever the statistics tell it for @endDate. If you have a big table with a lot of rows, that could give you a scan operation where you would benefit most from a seek.

Your wrapper procedure solution makes sure that SQL Server sees the values when DateRangeProc is compiled so it can use known values for both @endDate and @startDate.

Both of your dynamic queries lead to the same thing: the values are known at compile time.

The one with a default null value is a bit special. The values known to SQL Server at compile time are a known value for @endDate and null for @startDate. Using a null in a BETWEEN will give you 0 rows, but SQL Server always guesses at 1 in those cases. That might be a good thing in this case, but if you call the stored procedure with a large date interval where a scan would have been the best choice, it may end up doing a bunch of seeks.

I left "Use the DATEADD() function directly" to the end of this answer because it is the one I would use and there is something strange with it as well.

First off, SQL Server does not call the function multiple times when it is used in the where clause. DATEADD is considered runtime constant.

And I would think that DATEADD is evaluated when the query is compiled so that you would get a good estimate on the number of rows returned. But it is not so in this case.
SQL Server estimates based on the value in the parameter regardless of what you do with DATEADD (tested on SQL Server 2012) so in your case the estimate will be the number of rows that is registered on @endDate. Why it does that I don't know but it has to do with the use of the datatype DATE. Shift to DATETIME in the stored procedure and the table and the estimate will be accurate, meaning that DATEADD is considered at compile time for DATETIME not for DATE.

So, to summarize this rather lengthy answer, I would recommend the wrapper procedure solution. It will always allow SQL Server to use the values provided when compiling the query, without the hassle of using dynamic SQL.

PS:

In comments you got two suggestions.

OPTION (OPTIMIZE FOR UNKNOWN) will give you an estimate of 9% of rows returned and OPTION (RECOMPILE) will make SQL Server see the parameter values since the query is recompiled every time.
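
Both hints attach to the statement itself. A minimal sketch, reusing the Sale table and the SaleDate and Value columns from the question's test code. The literal date is simply the one used in those tests, and the behaviour noted in the comments is what is described above, not something measured here:

```sql
-- Sketch only: runs the same range query twice, once per query hint.
DECLARE @endDate DATE = '2012-12-10'
DECLARE @startDate DATE = DATEADD(DAY, -6, @endDate)

-- Recompiled on every execution, so the optimizer can see the current
-- values of @startDate and @endDate when it builds the plan.
SELECT SUM(Value)
FROM Sale
WHERE SaleDate BETWEEN @startDate AND @endDate
OPTION (RECOMPILE)

-- Tells the optimizer to ignore any sniffed values and estimate from
-- average statistics instead (the 9% guess for a BETWEEN mentioned above).
SELECT SUM(Value)
FROM Sale
WHERE SaleDate BETWEEN @startDate AND @endDate
OPTION (OPTIMIZE FOR UNKNOWN)
```

The trade-off with OPTION (RECOMPILE) is a compilation on every call, which is usually cheap compared with a query over a table of this size but is worth measuring for procedures that are called very frequently.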


3

Ok, I have two possible solutions for you.

First I'm wondering if this will allow for increased parameterization. I haven't had a chance to test it out but it might work.

CREATE PROCEDURE WeeklyProc(@endDate DATE, @startDate DATE)
AS
BEGIN
  IF @startDate IS NULL
    SET @startDate = DATEADD(DAY, -6, @endDate)
  SELECT
    -- Stuff
  FROM Sale
  WHERE SaleDate BETWEEN @startDate AND @endDate
END

The other option takes advantage of the fact you are using fixed time frames. First create a DateLookup table. Something like this

CurrentDate    8WeekStartDate    8WeekEndDate    etc

Fill it in for every date between now and the next century. This is only ~36500 rows so a fairly small table. Then change your query like this

IF @Range = '8WeekRange' 
    SELECT
      -- Stuff
    FROM Sale
    JOIN DateLookup
        ON SaleDate BETWEEN [8WeekStartDate] AND [8WeekEndDate]
    WHERE DateLookup.CurrentDate = CAST(GETDATE() AS DATE)

Obviously this is just an example and could certainly be written better but I've had a lot of luck with this type of table. Particularly since it is a static table and can be indexed like crazy.
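
For reference, the DateLookup table could be created and filled roughly as follows. This is a sketch under assumptions: the column types, the primary key, the 2000-01-01 to 2100-01-01 span (borrowed from the question's test code), and the definition of the 8-week range as the 56 days ending on CurrentDate are all illustrative choices, not something stated above.

```sql
-- Hypothetical layout; column names that start with a digit must be bracketed.
CREATE TABLE DateLookup
(
   CurrentDate      DATE NOT NULL PRIMARY KEY,
   [8WeekStartDate] DATE NOT NULL,
   [8WeekEndDate]   DATE NOT NULL
)
GO

-- Generate one row per calendar day with a recursive CTE.
;WITH Dates AS
(
   -- Anchor member: first day of the span (assumed).
   SELECT CAST('2000-01-01' AS DATE) AS CurrentDate
   UNION ALL
   -- Recursive member: add one day until the end of the span (assumed).
   SELECT DATEADD(DAY, 1, CurrentDate) FROM Dates WHERE CurrentDate < '2100-01-01'
)
INSERT INTO DateLookup (CurrentDate, [8WeekStartDate], [8WeekEndDate])
SELECT
   CurrentDate,
   DATEADD(DAY, -55, CurrentDate),  -- assumed: 56 days inclusive of CurrentDate
   CurrentDate
FROM Dates
OPTION (MAXRECURSION 0)  -- lift the default limit of 100 recursion levels
```

Further period columns (week, month, and so on) would be added and populated the same way, one start/end pair per fixed range.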

Licensed under cc by-sa 3.0 with attribution required.