Remove a lot of elements from very large SharePoint Lists
Lists in SharePoint is only look like database tables but not they are. For some things they are even better than databases – for common users for example, but for developers and administrators they are often not so good and friendly.
Let’s look for such a common thing in databases like "Truncate". To clear all the rows in a database table you need only to write "Truncate TABLE table_name" and execute it.
With SharePoint Lists it’s not so easy.
Some times earlier I had to delete data from list, but it was a list with small quantity of rows, about 5 or 10 thousands. But removing items from that list lasted about 10-15 minutes. It’s too long.
Recently I had to remove items from a list with 100k rows. I couldn’t even think of waiting for so long and started to look for solution.
I found a way of removing SharePoint list items using Batch.
But my task was some more difficult – I had to remove only first 90k rows of 100k. Below I write a powershell script for this action.
But it only moves items to SharePoint Recycle Bin. I will write how to clean deleted items from SharePoint Recycle Bin soon.
#Script to remove items from huge lists
Add-PSSnapin Microsoft.SharePoint.PowerShell -erroraction SilentlyContinue
cls
$web = get-spweb "https://spdev/MySite/"
$listname = "myFriends"
$list = $web.Lists[$listname]
$listItems = $list.ItemCount
Write-Host $listItems
$spQuery = New-Object Microsoft.SharePoint.SPQuery
$spQuery.ViewAttributes = "Scope='Recursive'";
#$perPage - variable to manage volume of deletable items per cycle
$perPage = 100
$spQuery.RowLimit = $perPage
#Count Items in SPList
$totalListItems = $list.ItemCount
Write-host "Total elements: " $listItems
#$minCount - The quantity you want to stay in the SPList
$minCount = 10000
#$cycles - how many SPListItems you need to remove
$cycles = $totalListItems - $minCount
#It's a query to remove from first iserted item, first in first out
$caml = '<OrderBy Override="TRUE"><FieldRef Name="ID" Ascending="True" /></OrderBy>'
$q = 0
while($q -lt $cycles)
{
#Marker to see that script works, mostly for debugging
Write-Host "-----------------------"
Get-Date
$listItems = $list.GetItems($spQuery)
$spQuery.ListItemCollectionPosition = $listItems.ListItemCollectionPosition
$batch = "<?xml version=`"1.0`" encoding=`"UTF-8`"?><Batch>"
$j = 0
for ($j = $perPage-1; $j -gt 0; $j--)
{
$item = $listItems[$j]
$batch += "<Method><SetList Scope=`"Request`">$($list.ID)</SetList>
<SetVar Name=`"ID`">$($item.ID)</SetVar>
<SetVar Name=`"Cmd`">Delete</SetVar>
</Method>"
$q++
if ($j -ge $count) { break }
}
$batch += "</Batch>"
write-host "Sending batch... "
$result = $web.ProcessBatchData($batch)
write-host "Emptying Recycle Bin..."
#If you want to remove all items from RecycleBin after removing from SPList, uncomment the string below.
#But I don't recomment you to clean Recycle because it can contain data from other users and may be they will want to restore them
# $web.RecycleBin.DeleteAll()
}
$web.Dispose()

Pipeline sequence in PowerShell