Official CozyRoc Blog

April 8, 2008

Released CozyRoc SSIS+ 1.2 SR-2

Filed under: Announcement, SSIS — cozyroc @ 2:46 am

Released CozyRoc SSIS+ 1.2 SR-2 (1.2.102.0). This release includes following major changes:

  • Implemented Data Flow Destination support a variable used to signal a waiting thread, data processing has finished.
  • Various fixes in the SSH back-end component.
  • Implemented SFTP task “Get remote files list” action to provide extended remote file information.
  • Implemented option in SFTP task to specify if remote path is a wildcard or not.
  • Fixed SFTP task to provide cleaner errors description.

February 26, 2008

Released CozyRoc SSIS+ 1.2 SR-1

Filed under: Announcement, SSIS — cozyroc @ 12:41 pm

Released CozyRoc SSIS+ 1.2 SR-1 (1.2.101.0). This release includes following major changes:

  •   Implemented new action in the SFTP task for retrieving list of remote files.
  •   Fixed Zip Task. It always created Zip64 archives, even when not needed.
  •   Fixed Database Partitions Task:
    • Create Partition action failed with “Specified cast is not valid.” error message for certain partitioned tables.
    • Insert Partition action failed with “Cannot insert explicit value for identity column in table ‘…’ when IDENTITY_INSERT is set to OFF.” when partitioned table contained identity columns.

The SQL 2008 version is also updated.

February 25, 2008

Custom SSIS component events not working “by design”

Filed under: SSIS — cozyroc @ 11:23 am

Don’t waste time implementing custom data flow component events under SQL Server 2005. There is a bug, which prevents it from working. I just got back response from the SSIS PM the issue is acknowledged, but it will not be fixed in the foreseeable future for SQL Server 2005. To follow the status of this issue, track this report.

January 20, 2008

Released official CozyRoc SSIS+ 1.2

Filed under: Announcement, SSIS — cozyroc @ 6:18 pm

We have great news. We have just released CozyRoc SSIS+ 1.2 library and our newer pricing structure. Greetings to Stacy, George, Bobby and Andy.

December 21, 2007

FTP over SSL in SSIS

Filed under: SSIS, Tips — cozyroc @ 9:15 am

We have implemented SSIS script task sample demonstrating FTP over SSL operations. It depends on FtpWebRequest for FTPS support and Script Task Plus for UI implementation. You may check it here

December 17, 2007

CozyRoc SSIS+ 1.2 Beta 2 Released

Filed under: Announcement, SSIS — cozyroc @ 11:33 pm

We are announcing with great pleasure the release of CozyRoc 1.2 Beta 2. It includes one very cool new component called Data Flow Task Plus. As you may have guessed from the name, it is an extension of the standard Microsoft Data Flow Task. When we started planning for the next version, we determined there are places where SSIS framework can be improved. One of those places was the data flow task, with the inability to reuse the logic in it in other packages. What we determined was needed, sounded daunting at first and believe me, our accomplishment so far is nothing short of amazing. What this new component is now giving you is the ability to componentize and reuse your data flow logic in other packages and lets you keep track of the changes separate from specific package. In addition to that we have implemented a customizable user interface for your data flow, so when you decide to distribute your shiny-dandy data flows to other users, they will not be required to check what is going inside, but just point-and-click the parameters to setup the task for the new job. Now you will probably realize how important were the Data Flow Source and Data Flow Destination components to allow complete abstraction of your data flows from specific package and achieve top performance.

This release includes also the following major changes:

  • SSH Connection Manager component.
    • Fixed connection manager. Sometimes it failed with ‘Value cannot be null’, when working against freeftpd server.
  • Script Task Plus component.
    • Implemented to initialize script from task-scoped SSIS variables. The new approach is better suited for package configuration. ScriptInit property should not be used and is depreciated.

This new version is great news and great holiday gift for powerful SSIS users and we will be glad to hear your opinion about it. Gives us a shout and let us know what you like and what else do you think is needed. The release is available for download here.

November 13, 2007

Implementing custom SSIS Control Flow Task in C++

Filed under: SSIS, Tips — cozyroc @ 11:41 pm

This topic is a work in progress. It will stay this way until Microsoft SSIS product team decides to document their native interfaces for third-parties to use. If you want to develop custom SSIS task in C++, you have to load yourself with patience and be ready to spend some time in the debugger and with Reflector. The undocumented native interfaces for implementing SSIS Control Flow Task can be found by opening the COM type library information in the DTS.dll library. The following interfaces represent roughly the base class Task from which all custom SSIS task must inherit:

There is one more interface IDTSComponentPersist90, which has to be implemented for serializing your task. After you are done implementing these four interfaces, you have to make your task visible to the SSIS framework. This is done by including additional information in the registry for your COM object. The additional registry keys under the specific task CLSID are:

  • DefaultIcon - Must point to an icon resource located in your DLL. Should have the format <your library>.dll,0 where 0 is my guess the index of the icon resource.
  • Programmable - I’m not sure if this is mandatory, but I saw other stock SSIS tasks using it.
  • Implemented Categories - Should contain {CD0DB111-958E-440D-9111-7D70D1490063} subkey. Categorizes your COM object as SSIS task?
  • DTSInfo - Various additional SSIS task metadata.
    • Description - task description.
    • ResourceFile - string resources containing additional task descriptions. Should have the format <your library>,10,20 where 10, 20 are string resource ids.
    • TaskContact - information describing where the SSIS originated from, copyright info, etc.
    • TaskType - Describes task compatibility. For SQL 2005, it should say DTS90.
    • UITypeName - Task user interface type reference.

After you compile and open an SSIS package, you should be able to browse and find your custom SSIS Control Flow Task listed.

p.s.

We haven’t spent time yet developing custom SSIS Data Flow Components, but you may check the type library in the DTSPipeline.dll library.

November 5, 2007

Even Easier SSIS Package Configuration

Filed under: SSIS — cozyroc @ 3:42 am

We all know the standard hard-coded SSIS package configuration types like XML, registry, INI file, SQL Server. Our opinion is that the SSIS package configuration implementation in the SQL 2005 incarnation is half baked, with no way to extend it with other configuration types. We have used for a long time in DTS a different configuration setup based on Excel worksheets. It gives you better maintenability and flexibility because you can use the Excel engine as a powerful initialization model. We have created a small Script Task to demonstrate the concept. You can use it as a sample to create your own configuration mechanisms. Please install SSIS+ 1.2 to use it. Enjoy.

November 4, 2007

Comparing performance of a raw file vs. CozyRoc Data Flow Source and Destination

Filed under: SSIS — cozyroc @ 12:42 am

As we have promised earlier, we have measured the performance of our new components Data Flow Source and Data Flow Destination. We have used the package provided by Jamie Thomson who has done similar test, comparing the performance of raw data file vs. ADO recordset. We have extended the package to include test for our components and another test without intermediate data flow step. The test without intermediate step is important because it establishes the baseline comparison. The current performance cannot be improved beyond the baseline, unless Microsoft makes changes to the SSIS data flow pipeline implementation. Here are the results:

Number of rows Baseline execution time Raw file execution time Data Flow execution time Data Flow vs. Baseline Data Flow vs. Raw
100000 5.531 6.156 6.203 -10.83% -0.76%
200000 7.359 8.5 8.313 -11.48% 2.25%
300000 9.14 10.891 10.438 -12.44% 4.34%
400000 11 13.218 12.625 -12.87% 4.70%
500000 12.765 15.563 14.75 -13.46% 5.51%
600000 14.563 17.797 16.828 -13.46% 5.76%
700000 16.391 20.218 18.875 -13.16% 7.12%
800000 18.282 22.578 21.297 -14.16% 6.01%
1000000 21.781 27.218 25.453 -14.43% 6.93%

Also in chart form:

Data Flow Performance

The performance improvement compared to the raw data file approach is evident. But what is even better is that the Data Flow Source and Data Flow Destination components are implemented to be non-blocking. So if the same test is executed on a dual or more processor machine, it will execute the reading and writing steps in parallel and show even better performance compared to the sequential raw data file approach. Another shortcoming of the raw data file approach is the fact that it depends on the file system and requires space to transfer the data. Some might say these days you have powerful subsystems with advanced caching, which is true of course if you can afford it.

If you want to check the test package, you may download it from here.

November 1, 2007

CozyRoc SSIS+ 1.2 Beta 1 Released

Filed under: Announcement, SSIS — cozyroc @ 8:59 am

It is November 1 and a perfect day to announce Beta 1 of CozyRoc SSIS+ 1.2 library. We have included some really powerful components, which will serve as foundation for other advanced components:

  • Package SSIS Connection Manager - Provides access to the current or different package at runtime.
  • Data Flow Source SSIS Data Flow Component - Very fast, non-blocking, in-memory data read from another data flow.
  • Data Flow Destination SSIS Data Flow Component - Exposes data from a data flow for in-memory read in another data flow.

We will soon blog about the performance of these new components. Our preliminary tests show they match the performance of the raw data file source/destination components, but there is one important difference. Our components do not need to go thru intermediate file write/read step. They are reading/writing in parallel and the process is not sequential as the Microsoft recommended solution for transferring data from one data flow to another. You can try it for yourself by downloading the package.

And of course the new release is available for download here.

Older Posts »

Blog at WordPress.com.